A global communication optimization technique based on data-flow analysis and linear algebra
Open Access
- 1 November 1999
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Programming Languages and Systems
- Vol. 21 (6), 1251-1297
- https://doi.org/10.1145/330643.330647
Abstract
Reducing communication overhead is extremely important in distributed-memory message-passing architectures. In this article, we present a technique to improve communication that considers data access patterns of the entire program. Our approach is based on a combination of traditional data-flow analysis and a linear algebra framework, and it works on structured programs with conditional statements and nested loops but without arbitrary goto statements. The distinctive features of the solution are the accuracy in keeping communication set information, support for general alignments and distributions including block-cyclic distributions, and the ability to simulate some of the previous approaches with suitable modifications. We also show how optimizations such as message vectorization, message coalescing, and redundancy elimination are supported by our framework. Experimental results on several benchmarks show that our technique is effective in reducing the number of messages (an average of 32% reduction), the volume of the data communicated (an average of 37% reduction), and the execution time (an average of 26% reduction).
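The sketch below is an illustrative, simplified view of the kind of communication-set bookkeeping the abstract describes, not the paper's actual framework: array regions are modeled as plain Python sets of indices, whereas the paper uses a linear algebra representation that also handles general alignments and block-cyclic distributions. The function name `communication_sets` and the `required`/`available` sets are hypothetical names introduced here for illustration.

```python
# A minimal sketch, assuming remote array regions can be modeled as sets of
# global indices. Redundancy elimination is expressed as a simple data-flow
# style computation: at each step, only the data not already made local by
# earlier communication is fetched (Comm_i = Required_i - Available_i).

def communication_sets(required_per_step):
    """For each program step, return the set of remote elements that must
    actually be communicated, given what earlier steps already fetched."""
    available = set()                    # elements already resident locally
    schedule = []
    for step, required in enumerate(required_per_step):
        comm = required - available      # fetch only what is not yet local
        available |= comm                # fetched data remains available
        schedule.append((step, sorted(comm)))
    return schedule

if __name__ == "__main__":
    # Two loop nests: both read remote elements 0..7, the second also 8..11.
    steps = [set(range(0, 8)), set(range(0, 12))]
    for step, comm in communication_sets(steps):
        # Message vectorization would then send each `comm` set as a single
        # aggregate message hoisted outside the loop nest, rather than one
        # message per element.
        print(f"step {step}: communicate {len(comm)} elements -> {comm}")
```

Run as written, the second step communicates only the 4 elements not covered by the first, which is the effect that redundancy elimination aims for; the paper's framework computes analogous sets exactly, over whole programs with conditionals and nested loops.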