Compiling for distributed memory architectures

1 March 1994

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Parallel and Distributed Systems

Vol. 5 (3), 281-298
https://doi.org/10.1109/71.277789

Abstract

The lack of high-level languages and good compilers for parallel machines hinders their widespread acceptance and use. Programmers must address issues such as process decomposition, synchronization, and load balancing. We have developed a parallelizing compiler that, given a sequential program and a memory layout of its data, performs process decomposition while balancing parallelism against locality of reference. A process decomposition is obtained by specializing the program for each processor to the data that resides on that processor. If this analysis fails, the compiler falls back to a simple but inefficient scheme called run-time resolution. Each process's role in the computation is determined by examining the data required for execution at run-time. Thus, our approach to process decomposition is data-driven rather than program-driven. We discuss several message optimizations that address the issues of overhead and synchronization in message transmission. Accumulation reorganizes the computation of a commutative and associative operator to reduce message traffic. Pipelining sends a value as close to its computation as possible to increase parallelism. Vectorization of messages combines messages with the same source and the same destination to reduce overhead. Our results from experiments in parallelizing SIMPLE, a large hydrodynamics benchmark, for the Intel iPSC/2, show a speedup within 60% to 70% of handwritten code.
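As an illustration of two ideas named in the abstract, the following Python sketch simulates run-time resolution and message vectorization for a toy loop `a[i] = b[i] + b[i + shift]` over a block-distributed array. It is a minimal sketch under stated assumptions, not the compiler described in the article: the block layout, the loop, and the names `owner`, `run_time_resolution`, and `vectorize` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def owner(i, n, p):
    """Processor owning element i when n elements are block-distributed over p processors (assumed layout)."""
    block = (n + p - 1) // p
    return i // block


def run_time_resolution(b, p, shift):
    """Simulate a[i] = b[i] + b[i + shift] on p processors.

    Every simulated processor scans every iteration and decides its role
    from data ownership alone: send an operand it owns to the owner of
    a[i], or compute a[i] if it owns that element.
    """
    n = len(b)
    a = [None] * n
    messages = []  # one (src, dst, index, value) tuple per element sent

    # Send phase: the owner of a remote operand ships it to the owner of a[i].
    for me in range(p):
        for i in range(n - shift):
            dst, src = owner(i, n, p), owner(i + shift, n, p)
            if me == src and me != dst:
                messages.append((src, dst, i + shift, b[i + shift]))

    # Compute phase: the owner of a[i] uses local data or a received value.
    inbox = {(dst, idx): val for (_, dst, idx, val) in messages}
    for me in range(p):
        for i in range(n - shift):
            if owner(i, n, p) != me:
                continue
            j = i + shift
            operand = b[j] if owner(j, n, p) == me else inbox[(me, j)]
            a[i] = b[i] + operand
    return a, messages


def vectorize(messages):
    """Combine messages that share a source and destination into one batch."""
    batches = defaultdict(list)
    for src, dst, idx, val in messages:
        batches[(src, dst)].append((idx, val))
    return dict(batches)


if __name__ == "__main__":
    a, msgs = run_time_resolution(list(range(8)), p=2, shift=2)
    print(a)
    print(len(msgs), "element messages,", len(vectorize(msgs)), "vectorized message(s)")
```

In this toy setting the element-wise scheme sends one message per non-local operand, while grouping by source and destination sends one message per processor pair, which is the overhead reduction the abstract attributes to message vectorization.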
