A similarity graph-based approach to declustering problems and its application towards parallelizing grid files
- 19 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 373-381
- https://doi.org/10.1109/icde.1995.380370
Abstract
We propose a new similarity-based technique for declustering data. The proposed method can adapt to available information about query distributions, data distributions, data sizes and partition-size constraints. The method is based on max-cut partitioning of a similarity graph defined over the given set of data, under constraints on the partition sizes. It maximizes the chances that a pair of data-items that are to be accessed together by queries are allocated to distinct disks. We show that the proposed method can achieve optimal speed-up for a query-set, if there exists any other declustering method which will achieve the optimal speed-up. Experiments in parallelizing grid files show that the proposed method outperforms mapping-function-based methods for interesting query distributions as well for non-uniform data distributions.<>Keywords
This publication has 12 references indexed in Scilit:
- Declustering using fractalsPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2002
- Latin squares for parallel array accessIEEE Transactions on Parallel and Distributed Systems, 1993
- Disk allocation methods using error correcting codesIEEE Transactions on Computers, 1991
- An improved two-way partitioning algorithm with stable performance (VLSI)IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 1991
- The Gamma database machine projectIEEE Transactions on Knowledge and Data Engineering, 1990
- Optimal file distribution for partial match retrievalACM SIGMOD Record, 1988
- Performance of two-disk partition data allocationsBIT Numerical Mathematics, 1987
- An Algorithm for Partitioning the Nodes of a GraphSIAM Journal on Algebraic Discrete Methods, 1982
- Disk allocation for Cartesian product files on multiple-disk systemsACM Transactions on Database Systems, 1982
- An Efficient Heuristic Procedure for Partitioning GraphsBell System Technical Journal, 1970