Tuning a parallel database algorithm on a shared‐memory multiprocessor
- 1 July 1992
- journal article
- Published by Wiley in Software: Practice and Experience
- Vol. 22 (7) , 495-517
- https://doi.org/10.1002/spe.4380220702
Abstract
Database query processing can benefit significantly from parallelism. Parallel database algorithms combine substantial CPU and I/O activity, memory requirements, and massive data exchange between processes, all of which must be considered to obtain optimal performance. Since parallel external sorting is a very typical example, we have focused on sorting to tune Volcano, a new query processing system. The purpose of the Volcano project is to provide efficient, extensible tools for query and request processing in novel application domains, particularly in object‐oriented and scientific database systems, and for experimental database performance research. It includes all query processing algorithms conventionally used in relational database systems as well as several new ones, and can execute all of them in parallel. In this article, we present Volcano's parallel external sorting algorithm and a sequence of enhancements to improve its performance. We obtained very good absolute performance, 84 seconds for 100 MB of data, as well as near‐linear speedup with sixteen CPUs and disks. Furthermore, these results were achieved on a shared‐memory machine despite the common belief that parallel query processing is best implemented on distributed‐memory systems. We detail our tuning measures and report on their effectiveness.Keywords
This publication has 23 references indexed in Scilit:
- The Gamma database machine projectIEEE Transactions on Knowledge and Data Engineering, 1990
- The implementation of POSTGRESIEEE Transactions on Knowledge and Data Engineering, 1990
- Prototyping Bubba, a highly parallel database systemIEEE Transactions on Knowledge and Data Engineering, 1990
- Starburst mid-flight: as the dust clears (database project)IEEE Transactions on Knowledge and Data Engineering, 1990
- Merging sorted runs using large main memoryActa Informatica, 1989
- Sorting large files on a backend multiprocessorIEEE Transactions on Computers, 1988
- Join processing in database systems with large main memoriesACM Transactions on Database Systems, 1986
- Principles of database buffer managementACM Transactions on Database Systems, 1984
- Hardware Support for Advanced Data Management SystemsComputer, 1984
- A taxonomy of parallel sortingACM Computing Surveys, 1984