Dynamic load balancing in multicomputer database systems using partition tuning
- 1 January 1995
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Knowledge and Data Engineering
- Vol. 7 (6) , 968-983
- https://doi.org/10.1109/69.476502
Abstract
Shared nothing multiprocessor architecture is known to be more scalable to support very large databases. Compared to other join strategies, a hash-based join algorithm is particularly efficient and easily parallelized for this computation model. However, this hardware structure is very sensitive to the skew in tuple distribution. Unless the parallel hash join algorithm includes some dynamic load balancing mechanism, the skew effect can severely deteriorate the system performance. In this paper, we investigate this issue. In particular, three parallel hash join algorithms are presented. We implement a simulator to study the effectiveness of these schemes. The simulation model is validated by comparing the simulation results to those produced by the actual implementation of the algorithms running on a multiprocessor system. Our performance study indicates that a naive approach is not able to provide tangible savings. However, the carefully designed strategies can offer substantial improvement over conventional techniques for a wide range of skew conditions.Keywords
This publication has 18 references indexed in Scilit:
- A performance evaluation of load balancing techniques for join operations on multicomputer database systemsPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2002
- Considering data skew factor in multi-way join query optimization for parallel executionThe VLDB Journal, 1993
- Dynamic load balancing in very large shared-nothing hypercube database computersIEEE Transactions on Computers, 1993
- The Gamma database machine projectIEEE Transactions on Knowledge and Data Engineering, 1990
- Prototyping Bubba, a highly parallel database systemIEEE Transactions on Knowledge and Data Engineering, 1990
- A performance evaluation of four parallel join algorithms in a shared-nothing multiprocessor environmentACM SIGMOD Record, 1989
- A hash-based join algorithm for a cube-connected parallel computerInformation Processing Letters, 1989
- Fragmentation: a technique for efficient query processingACM Transactions on Database Systems, 1986
- Implementation techniques for main memory database systemsPublished by Association for Computing Machinery (ACM) ,1984
- Application of hash to data base machine and its architectureNew Generation Computing, 1983