Multiphase complete exchange on Paragon, SP2, and CS-2
- 1 January 1996
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Parallel & Distributed Technology: Systems & Applications
- Vol. 4 (3), 45-59
- https://doi.org/10.1109/88.532139
Abstract
The overhead of interprocessor communication is a major factor limiting the performance of parallel computer systems. The complete exchange is the most demanding communication pattern because it requires each processor to send a distinct message to every other processor. This pattern is at the heart of many important parallel applications. There are three main algorithms for complete exchange, all designed for hypercubes: the direct exchange, the standard exchange, and the multiphase exchange. Most contemporary commercial multicomputer systems are not hypercubes. However, through special-purpose hardware and dedicated communication processors, these systems can achieve very high-performance communication and can emulate hypercubes quite well. Multiphase complete exchange, which is actually a family of algorithms with standard and direct exchange as extreme cases, performs optimally for varying message sizes. The author has implemented multiphase complete exchange on three contemporary parallel architectures: the Intel Paragon, the IBM SP2, and the Meiko CS-2. He describes the essential features of these machines and discusses their basic interprocessor communication overheads. Then he evaluates the performance of multiphase complete exchange on each machine. He found that the Paragon executes the multiphase exchange well and yields smooth performance plots, with the cyclic variations in these plots stemming from memory access patterns; the SP2 exhibits enormous fluctuations in its plots because of interference from other jobs; and the CS-2 exhibits small fluctuations and the largest differences between predicted and observed timings. The author concludes that the theoretical ideas developed for hypercubes also apply to these machines and that multiphase complete exchange can lead to major savings in execution time over traditional solutions.
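The relationship between the three algorithms can be illustrated with a small simulation. This is a minimal sketch, not the author's implementation: it assumes the usual formulation in which the hypercube dimension d is split into a partition of subcube dimensions and a direct exchange is performed inside each subcube in turn. The function name `multiphase_exchange` and its parameters are hypothetical, and the model counts only steps and blocks, ignoring all timing and data-permutation costs.

```python
def multiphase_exchange(d, partition):
    """Simulate complete exchange on 2**d nodes (illustrative model only).

    partition: subcube dimensions summing to d (an assumption of this sketch).
    [1]*d models the standard exchange, [d] models the direct exchange.
    Returns (held, blocks_per_step), where held[n] lists the (src, dst)
    messages at node n afterwards and blocks_per_step[i] is the number
    of blocks node 0 transmits in step i.
    """
    assert sum(partition) == d
    p = 1 << d
    # Each node starts with one distinct message for every node (itself included).
    held = [[(n, dst) for dst in range(p)] for n in range(p)]
    blocks_per_step = []
    lo = 0  # lowest address bit handled by the current phase
    for k in partition:
        mask = ((1 << k) - 1) << lo
        # Direct exchange inside each k-dimensional subcube: 2**k - 1 steps.
        for j in range(1, 1 << k):
            offset = j << lo
            nxt = [[] for _ in range(p)]
            sent_by_node0 = 0
            for n in range(p):
                partner = n ^ offset
                for msg in held[n]:
                    # Forward a message when its destination agrees with
                    # the partner on the address bits of this phase.
                    if (msg[1] & mask) == (partner & mask):
                        nxt[partner].append(msg)
                        if n == 0:
                            sent_by_node0 += 1
                    else:
                        nxt[n].append(msg)
            held = nxt
            blocks_per_step.append(sent_by_node0)
        lo += k
    return held, blocks_per_step


if __name__ == "__main__":
    for part in ([1, 1, 1], [2, 1], [3]):
        held, blocks = multiphase_exchange(3, part)
        # Every node must end up with exactly one message from every source.
        assert all(sorted(held[n]) == [(s, n) for s in range(8)] for n in range(8))
        print(part, "steps:", len(blocks), "blocks per step:", blocks)
```

In this model the standard exchange uses few steps with large transmissions, the direct exchange uses many steps with single-block transmissions, and intermediate partitions fall in between, which is the trade-off that makes the best choice depend on message size.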