Multiphase complete exchange on Paragon, SP2, and CS-2

1 January 1996

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Parallel & Distributed Technology: Systems & Applications

Vol. 4 (3), 45-59
https://doi.org/10.1109/88.532139

Abstract

The overhead of interprocessor communication is a major factor in limiting the performance of parallel computer systems. The complete exchange is the severest communication pattern in that it requires each processor to send a distinct message to every other processor. This pattern is at the heart of many important parallel applications. There are three main algorithms for complete exchange, all designed for hypercubes: the direct exchange, the standard exchange, and the multiphase exchange. Most contemporary commercial multicomputer systems are not hypercubes. However, through special-purpose hardware and dedicated communication processors, these systems can achieve very high-performance communication and can emulate hypercubes quite well. Multiphase complete exchange, which is actually a family of algorithms with standard and direct exchange as extreme cases, performs optimally for varying message sizes. The author has implemented multiphase complete exchange on three contemporary parallel architectures: the Intel Paragon, the IBM SP2, and the Meiko CS-2. He describes the essential features of these machines and discusses their basic interprocessor communication overheads. Then he evaluates the performance of multiphase complete exchange on each machine. He found that the Paragon executes the multiphase algorithm well and yields smooth performance plots, with the cyclic variations in these plots stemming from memory access patterns; the SP2 exhibits enormous fluctuations in its plots because of interference from other jobs; and the CS-2 exhibits small fluctuations and the largest differences between predicted and observed timings. The author concludes that the theoretical ideas developed for hypercubes also apply to these machines and that multiphase complete exchange can lead to major savings in execution time over traditional solutions.
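To make the relationship between the three algorithms concrete, the following is a minimal Python sketch, not the author's implementation. It simulates complete exchange on a 2**d-node hypercube, where a partition of d into phase widths selects the algorithm: a single phase of width d behaves like direct exchange, d phases of width 1 behave like standard exchange, and anything in between is a multiphase variant. The function names, the set-based block bookkeeping, and the two-parameter cost model (a fixed startup latency plus a per-byte cost) are assumptions made for illustration; the sketch ignores the data shuffles, link contention, and machine-specific overheads that the article measures.

```python
def multiphase_exchange(d, partition):
    """Simulate complete exchange on a 2**d-node hypercube.

    partition lists the phase widths (positive integers summing to d).
    Returns (steps, blocks_per_step, held), where held[p] is the set of
    (source, destination) blocks at node p after the exchange.
    """
    assert sum(partition) == d and all(w > 0 for w in partition)
    n = 1 << d
    # Initially node p holds one distinct block for every destination q.
    held = [{(p, q) for q in range(n)} for p in range(n)]
    steps = 0
    blocks_per_step = []
    shift = 0  # position of the address bits handled by the current phase
    for width in partition:
        mask = ((1 << width) - 1) << shift
        # Within a phase, run a direct exchange over the 2**width nodes
        # that differ only in this field of address bits.
        for j in range(1, 1 << width):
            outgoing = []
            for p in range(n):
                partner = p ^ (j << shift)
                # Send every block whose destination matches the partner
                # in the address bits of this phase.
                send = {b for b in held[p] if (b[1] & mask) == (partner & mask)}
                outgoing.append((p, partner, send))
            # Deliver only after all nodes have chosen what to send,
            # modelling a simultaneous pairwise exchange.
            for p, partner, send in outgoing:
                held[p] -= send
                held[partner] |= send
            steps += 1
            blocks_per_step.append(len(outgoing[0][2]))
        shift += width
    return steps, blocks_per_step, held


def estimated_time(d, partition, block_bytes, latency, per_byte):
    """Toy cost model (an assumption, not the article's timing model).

    Each step costs latency + per_byte * bytes_sent; a phase of width w
    takes 2**w - 1 steps, each carrying 2**(d - w) blocks.
    """
    return sum(
        (2**w - 1) * (latency + per_byte * block_bytes * 2 ** (d - w))
        for w in partition
    )


if __name__ == "__main__":
    d = 3
    for partition in ([3], [2, 1], [1, 1, 1]):
        steps, sizes, held = multiphase_exchange(d, partition)
        ok = all(held[q] == {(p, q) for p in range(1 << d)} for q in range(1 << d))
        print(partition, "steps:", steps, "blocks per step:", sizes, "correct:", ok)
```

Under this toy model, many small phases mean few steps with large messages, which favours small blocks where startup latency dominates; one wide phase means many steps with small messages, which favours large blocks where transfer time dominates. Intermediate partitions trade the two off.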

This publication has 12 references indexed in Scilit:

Concurrent Bidirectional Communication On The Intel iPSC/860 And iPSC/2
Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
Efficient All-to-All Communication Patterns in Hypercube and Mesh Topologies
Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
Multiphase complete exchange: a theoretical analysis
IEEE Transactions on Computers, 1996
CCL: a portable and tunable collective communication library for scalable parallel computers
IEEE Transactions on Parallel and Distributed Systems, 1995
Optimal multiphase complete exchange on circuit-switched hypercube architectures
Published by Association for Computing Machinery (ACM), 1994
An architecture for optimal all-to-all personalized communication
Published by Association for Computing Machinery (ACM), 1994
Adaptive routing protocols for hypercube interconnection networks
Computer, 1993
Efficient communication primitives on hypercubes
Concurrency: Practice and Experience, 1992
Algorithms for Matrix Transposition on Boolean N-Cube Configured Ensemble Architectures
SIAM Journal on Matrix Analysis and Applications, 1988
Parallel Processing with the Perfect Shuffle
IEEE Transactions on Computers, 1971