Heuristics for trie index minimization
- 1 September 1979
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Database Systems
- Vol. 4 (3) , 383-395
- https://doi.org/10.1145/320083.320102
Abstract
A trie is a digital search tree in which leaves correspond to records in a file. Searching proceeds from the root to a leaf, where the edge taken at each node depends on the value of an attribute in the query. Trie implementations have the advantage of being fast, but the disadvantage of achieving that speed at great expense in storage space. Of primary concern in making a trie practical, therefore, is the problem of minimizing storage requirements. One method for reducing the space required is to reorder attribute testing. Unfortunately, the problem of finding an ordering which guarantees a minimum-size trie is NP-complete. In this paper we investigate several heuristics for reordering attributes, and derive bounds on the sizes of the worst tries produced by them in terms of the underlying file. Although the analysis is presented for a binary file, extensions to files of higher degree are shown. Another alternative for reducing the space required by a trie is an implementation, called an Ο-trie, in which the order of attribute testing is contained in the trie itself. We show that for most applications, Ο-tries are smaller than other implementations of tries, even when heuristics for improving storage requirements are employed.Keywords
This publication has 10 references indexed in Scilit:
- The Complexity of Trie Index ConstructionJournal of the ACM, 1977
- Hashing and trie algorithms for partial match retrievalACM Transactions on Database Systems, 1976
- Partial-Match Retrieval AlgorithmsSIAM Journal on Computing, 1976
- Design of tree structures for efficient queryingCommunications of the ACM, 1973
- Tree Structures for Optimal SearchingJournal of the ACM, 1970
- Variable length tree structures having minimum average search timeCommunications of the ACM, 1969
- PATRICIA—Practical Algorithm To Retrieve Information Coded in AlphanumericJournal of the ACM, 1968
- Use of tree structures for processing filesCommunications of the ACM, 1963
- Trie memoryCommunications of the ACM, 1960
- File searching using variable length keysPublished by Association for Computing Machinery (ACM) ,1959