Access normalization

1 November 1993

journal article
Published by Association for Computing Machinery (ACM) in ACM Transactions on Computer Systems

Vol. 11 (4), 353-375
https://doi.org/10.1145/161541.159766

Abstract

In scalable parallel machines, processors can make local memory accesses much faster than they can make remote memory accesses. Additionally, when a number of remote accesses must be made, it is usually more efficient to use block transfers of data rather than to use many small messages. To run well on such machines, software must exploit these features. We believe it is too onerous for a programmer to do this by hand, so we have been exploring the use of restructuring compiler technology for this purpose. In this article, we start with a language like HPF-Fortran with user-specified data distribution and develop a systematic loop transformation strategy called access normalization that restructures loop nests to exploit locality and block transfers. We demonstrate the power of our techniques using routines from the BLAS (Basic Linear Algebra Subprograms) library. An important feature of our approach is that we model loop transformation using invertible matrices and integer lattice theory.
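
As an illustration of the last sentence of the abstract, the sketch below models a loop transformation as an invertible integer matrix applied to iteration vectors. It is not the article's algorithm. The array subscript `i + j`, the matrix `T`, and all function names are assumptions chosen for the example, and the matrix is restricted to the unimodular 2x2 case (determinant +1 or -1) to keep the inverse integral. The idea shown is that if a distributed array dimension is subscripted by `i + j`, choosing the new outer index `u = i + j` makes that subscript a plain loop index, so every iteration with the same `u` touches the same part of the array.

```python
# Hypothetical example: T maps the old iteration vector (i, j) to (u, v),
# with u = i + j (the assumed distributed subscript) and v = j.
T = [[1, 1],
     [0, 1]]


def det2(t):
    """Determinant of a 2x2 integer matrix."""
    return t[0][0] * t[1][1] - t[0][1] * t[1][0]


def inverse_unimodular_2x2(t):
    """Integer inverse of a 2x2 matrix whose determinant is +1 or -1."""
    d = det2(t)
    if d not in (1, -1):
        raise ValueError("matrix is not unimodular")
    (a, b), (c, e) = t
    # For d in {+1, -1}, dividing by d is the same as multiplying by d.
    return [[e * d, -b * d], [-c * d, a * d]]


def apply(t, vec):
    """Multiply a 2x2 matrix by a 2-element vector."""
    return (t[0][0] * vec[0] + t[0][1] * vec[1],
            t[1][0] * vec[0] + t[1][1] * vec[1])


def original_iterations(n):
    """Original nest: for i in 0..n-1, for j in 0..n-1."""
    return [(i, j) for i in range(n) for j in range(n)]


def normalized_iterations(n):
    """Transformed nest over (u, v); yields (u, v, i, j).

    The bounds follow from 0 <= u - v < n and 0 <= v < n.
    """
    t_inv = inverse_unimodular_2x2(T)
    for u in range(2 * n - 1):
        for v in range(max(0, u - n + 1), min(n - 1, u) + 1):
            i, j = apply(t_inv, (u, v))
            yield (u, v, i, j)


if __name__ == "__main__":
    for u, v, i, j in normalized_iterations(3):
        print(f"u={u} v={v} -> i={i} j={j}")
```

Because `T` is invertible over the integers, the transformed nest visits exactly the original iterations, only in a different order. Whether that reordering is legal for a real loop depends on its data dependences, which this sketch does not check.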

This publication has 13 references indexed in Scilit:

A loop transformation theory and an algorithm to maximize parallelism
IEEE Transactions on Parallel and Distributed Systems, 1991
Limits on interconnection network performance
IEEE Transactions on Parallel and Distributed Systems, 1991
Compile-time techniques for data distribution in distributed memory machines
IEEE Transactions on Parallel and Distributed Systems, 1991
Compiling global name-space parallel loops for distributed execution
IEEE Transactions on Parallel and Distributed Systems, 1991
Data optimization: Allocation of arrays to reduce communication on SIMD machines
Journal of Parallel and Distributed Computing, 1990
Strategies for cache and local memory management by global program transformation
Journal of Parallel and Distributed Computing, 1988
Dependence
Published by Springer Nature, 1988
Automatic translation of FORTRAN programs to vector form
ACM Transactions on Programming Languages and Systems, 1987
Advanced compiler optimizations for supercomputers
Communications of the ACM, 1986
The parallel execution of DO loops
Communications of the ACM, 1974