A unified framework for optimizing communication in data-parallel programs
- 1 July 1996
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Parallel and Distributed Systems
- Vol. 7 (7), 689-704
- https://doi.org/10.1109/71.508249
Abstract
This paper presents a framework, based on global array data-flow analysis, to reduce communication costs in a program being compiled for a distributed memory machine. We introduce the available section descriptor, a novel representation of communication involving array sections. This representation allows us to apply techniques for partial redundancy elimination to obtain powerful communication optimizations. With a single framework, we are able to capture optimizations like 1) vectorizing communication, 2) eliminating communication that is redundant on any control flow path, 3) reducing the amount of data being communicated, 4) reducing the number of processors to which data must be communicated, and 5) moving communication earlier to hide latency and to subsume previous communication. We show that the bidirectional problem of eliminating partial redundancies can be decomposed into simpler unidirectional problems even in the context of an array section representation, which makes the analysis procedure more efficient. We present results from a preliminary implementation of this framework, which are extremely encouraging, and demonstrate the effectiveness of this analysis in improving the performance of programs.
This publication has 22 references indexed in Scilit:
- Automating Parallelism of Regular Computations for Distributed-Memory Multicomputers in the Paradigm Compiler. Published by Institute of Electrical and Electronics Engineers (IEEE), 1993
- How to analyze large programs efficiently and informatively. Published by Association for Computing Machinery (ACM), 1992
- Compiling communication-efficient programs for massively parallel machines. IEEE Transactions on Parallel and Distributed Systems, 1991
- Compiling global name-space parallel loops for distributed execution. IEEE Transactions on Parallel and Distributed Systems, 1991
- Data-parallel programming on multicomputers. IEEE Software, 1990
- Process decomposition through locality of reference. Published by Association for Computing Machinery (ACM), 1989
- SUPERB: A tool for semi-automatic MIMD/SIMD parallelization. Parallel Computing, 1988
- Data dependence and its application to parallel processing. International Journal of Parallel Programming, 1987
- Global optimization by suppression of partial redundancies. Communications of the ACM, 1979
- Testing flow graph reducibility. Journal of Computer and System Sciences, 1974