Declustering of key-based partitioned signature files
- 1 September 1996
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Database Systems
- Vol. 21 (3) , 295-338
- https://doi.org/10.1145/232753.232755
Abstract
Access methods based on signature files can largely benefit from possibilities offered by parallel environments. To this end, an effective declustering strategy that would distribute signatures over a set of parallel independent disks has to be combined with a synergic clustering which is employed to avoid searching the whole signature file while executing a query. This article proposes two parallel signature file organizations, Hamming Filter ( HF ) and Hamming + Filter ( H + F ), whose common declustering strategy is based on error correcting codes , and where clustering is achieved by organizing signatures into fixed-size buckets, each containing signatures sharing the same key value. HF allocates signatures on disks in a static way and works well if a correct relationship holds between the parameters of the code and the size of the file. H + F is a generalization of HF suitable to manage highly dynamic files. It uses a dynamic declustering, obtained through a sequence of codes, and organizes a smooth migration of signatures between disks so that high performance levels are retained regardless of current file size. Theoretical analysis characterizes the best-case, expected, and worst-case behaviors of these organizations. Analytical results are verified by experiments on prototype systems.Keywords
This publication has 19 references indexed in Scilit:
- Concurrent frame signature filesDistributed and Parallel Databases, 1993
- Estimating accesses in partitioned signature file organizationsACM Transactions on Information Systems, 1993
- Optimal disk allocation for partial match queriesACM Transactions on Database Systems, 1993
- Frame-sliced signature filesIEEE Transactions on Knowledge and Data Engineering, 1992
- Dynamic partitioning of signature filesACM Transactions on Information Systems, 1991
- Partitioned signature files: design issues and performance evaluationACM Transactions on Information Systems, 1989
- Optimal signature extraction and information lossACM Transactions on Database Systems, 1987
- Parallel free-text search on the connection machine systemCommunications of the ACM, 1986
- Signature filesACM Transactions on Information Systems, 1984
- Disk allocation for Cartesian product files on multiple-disk systemsACM Transactions on Database Systems, 1982