Compiling for distributed memory architectures
- 1 March 1994
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Parallel and Distributed Systems
- Vol. 5 (3), 281-298
- https://doi.org/10.1109/71.277789
Abstract
The lack of high-level languages and good compilers for parallel machines hinders their widespread acceptance and use. Programmers must address issues such as process decomposition, synchronization, and load balancing. We have developed a parallelizing compiler that, given a sequential program and a memory layout of its data, performs process decomposition while balancing parallelism against locality of reference. A process decomposition is obtained by specializing the program for each processor to the data that resides on that processor. If this analysis fails, the compiler falls back to a simple but inefficient scheme called run-time resolution. Each process's role in the computation is determined by examining the data required for execution at run-time. Thus, our approach to process decomposition is data-driven rather than program-driven. We discuss several message optimizations that address the issues of overhead and synchronization in message transmission. Accumulation reorganizes the computation of a commutative and associative operator to reduce message traffic. Pipelining sends a value as close to its computation as possible to increase parallelism. Vectorization of messages combines messages with the same source and the same destination to reduce overhead. Our results from experiments in parallelizing SIMPLE, a large hydrodynamics benchmark, for the Intel iPSC/2, show a speedup within 60% to 70% of handwritten code.
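The message vectorization idea in the abstract can be illustrated with a toy simulation. The sketch below is not the compiler described in the article. It assumes a block layout of two arrays over `p` processors, a hypothetical loop `a[i] = b[i-k] + b[i+k]`, and an owner-computes rule in which the processor holding `a[i]` performs the assignment. The names `owner`, `element_messages`, and `vectorize` are invented for this illustration. It first lists one message per remote operand, roughly what a naive run-time scheme would send, then combines messages that share a source and a destination.

```python
from collections import defaultdict


def owner(i, n, p):
    """Processor owning element i when n elements are split into p blocks (assumed layout)."""
    block = (n + p - 1) // p  # ceiling division gives the block size
    return i // block


def element_messages(n, p, k=2):
    """List one message per remote operand for the hypothetical loop
        for i in k .. n-k-1:  a[i] = b[i-k] + b[i+k]
    The owner of a[i] does the assignment; the owner of a remote b[j] sends it.
    """
    msgs = []  # each entry is (source, destination, index of b sent)
    for i in range(k, n - k):
        dst = owner(i, n, p)
        for j in (i - k, i + k):
            src = owner(j, n, p)
            if src != dst:
                msgs.append((src, dst, j))
    return msgs


def vectorize(msgs):
    """Combine messages with the same source and destination into one message."""
    combined = defaultdict(list)
    for src, dst, j in msgs:
        combined[(src, dst)].append(j)
    return dict(combined)


if __name__ == "__main__":
    msgs = element_messages(n=16, p=4)
    packed = vectorize(msgs)
    print(len(msgs), "element messages become", len(packed), "combined messages")
```

With 16 elements on 4 processors and `k = 2`, the 12 single-element messages collapse into 6 combined messages, one for each ordered pair of neighboring processors. On a real machine the benefit comes from paying the per-message startup cost once per pair instead of once per element.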