Optimizing equijoin queries in distributed databases where relations are hash partitioned
- 1 May 1991
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Database Systems
- Vol. 16 (2) , 279-308
- https://doi.org/10.1145/114325.103713
Abstract
Consider the class of distributed database systems consisting of a set of nodes connected by a high bandwidth network. Each node consists of a processor, a random access memory, and a slower but much larger memory such as a disk. There is no shared memory among the nodes. The data are horizontally partitioned often using a hash function. Such a description characterizes many parallel or distributed database systems that have recently been proposed, both commercial and academic. We study the optimization problem that arises when the query processor must repartition the relations and intermediate results participating in a multijoin query. Using estimates of the sizes of intermediate relations, we show (1) optimum solutions for closed chain queries; (2) the NP-completeness of the optimization problem for star, tree, and general graph queries; and (3) effective heuristics for these hard cases. Our general approach and many of our results extend to other attribute partitioning schemes, for example, sort-partitioning on attributes, and to partitioned object databases.Keywords
This publication has 27 references indexed in Scilit:
- Optimizing join queries in distributed databasesIEEE Transactions on Software Engineering, 1988
- Set query optimization in distributed database systemsACM Transactions on Database Systems, 1986
- Join processing in database systems with large main memoriesACM Transactions on Database Systems, 1986
- Fragmentation: a technique for efficient query processingACM Transactions on Database Systems, 1986
- Optimization of join operations in horizontally partitioned database systemsACM Transactions on Database Systems, 1986
- Optimization of distributed tree queriesJournal of Computer and System Sciences, 1984
- Join and Semijoin Algorithms for a Multiprocessor Database MachineACM Transactions on Database Systems, 1984
- Tree queriesACM Transactions on Database Systems, 1982
- Query processing in a system for distributed databases (SDD-1)ACM Transactions on Database Systems, 1981
- Implementing a relational database by means of specialzed hardwareACM Transactions on Database Systems, 1979