Common Properties of Some Multiattribute File Systems
- 1 March 1979
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Software Engineering
- Vol. SE-5 (2) , 160-174
- https://doi.org/10.1109/tse.1979.234172
Abstract
This paper results from an attempt to unify several different file system design theories. We define a term "partial match pattern" and show that in order to produce file systems optimal with respect to partial match patterns, both the multikey hashing (MKH) method [16] and the multidimensional directory (MDD) method [11] must be in such a form that the number of subdivisions is the same for all domains of keys. We show the conditions for the string homomorphism hashing (SHH) method [15], the MKH method, and the MDD method to be equivalent to one another. We define the so-called Cartesian product files and show that if all records are present, the records in a Cartesian product file form a shortest spanning path in which the Hamming distance between every pair of consecutive records is 1. Thus the SHH method, the MKH method, the MDD method, and the multikey sorting (MKS) method [10] are linked together. Finally, we show that for both partial and best match queries, the file systems exhibit a common characteristic: similar records are grouped together.Keywords
This publication has 9 references indexed in Scilit:
- Multi-dimensional clustering for data base organizationsInformation Systems, 1977
- An Algorithm for Finding Best Matches in Logarithmic Expected TimeACM Transactions on Mathematical Software, 1977
- Application of Principal Component Analysis to Multikey SearchingIEEE Transactions on Software Engineering, 1976
- Partial-Match Retrieval AlgorithmsSIAM Journal on Computing, 1976
- An Algorithm for Finding Nearest NeighborsIEEE Transactions on Computers, 1975
- Multidimensional binary search trees used for associative searchingCommunications of the ACM, 1975
- A Branch and Bound Algorithm for Computing k-Nearest NeighborsIEEE Transactions on Computers, 1975
- Experiments with some cluster analysis algorithmsPattern Recognition, 1974
- Attribute based file organization in a paged memory environmentCommunications of the ACM, 1974