Data distribution support on distributed shared memory multiprocessors
- 1 May 1997
- proceedings article
- Published by Association for Computing Machinery (ACM)
- Vol. 32 (5) , 334-345
- https://doi.org/10.1145/258915.258945
Abstract
Cache-coherent multiprocessors with distributed shared memory are becoming increasingly popular for parallel computing. However, obtaining high performance on these machines requires that an application execute with good data locality. In addition to making effective use of caches, it is often necessary to distribute data structures across the local memories of the processing nodes, thereby reducing the latency of cache misses. We have designed a set of abstractions for performing data distribution in the context of explicitly parallel programs and implemented them within the SGI MIPSpro compiler system. Our system incorporates many unique features to enhance both programmability and performance. We address the former by providing a very simple programming model with extensive support for error detection. Regarding performance, we carefully design the user abstractions with the underlying compiler optimizations in mind, we incorporate several optimization techniques to generate efficient code for accessing distributed data, and we provide a tight integration of these techniques with other optimizations within the compiler. Our initial experience suggests that the directives are easy to use and can yield substantial performance gains, in some cases by as much as a factor of 3 over the same codes without distribution.
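The idea of distributing a data structure across the local memories of the nodes can be illustrated with a small sketch. The following is not the paper's directive syntax or the MIPSpro implementation; it assumes a simple block distribution of a one-dimensional array, and the function names (`block_size`, `block_owner`, `elements_per_node`) are hypothetical, chosen only for illustration.

```python
def block_size(n_elements, n_nodes):
    """Elements assigned to each node under a block distribution (ceiling division)."""
    return -(-n_elements // n_nodes)


def block_owner(index, n_elements, n_nodes):
    """Return the node whose local memory would hold element `index`.

    Assumption: contiguous blocks of equal size are assigned to nodes
    0, 1, 2, ... in order; the last node may hold a shorter block.
    """
    return index // block_size(n_elements, n_nodes)


def elements_per_node(n_elements, n_nodes):
    """Count how many elements land on each node."""
    counts = [0] * n_nodes
    for i in range(n_elements):
        counts[block_owner(i, n_elements, n_nodes)] += 1
    return counts


if __name__ == "__main__":
    # 10 elements over 4 nodes: blocks of 3, with the remainder on the last node.
    print(elements_per_node(10, 4))
```

If each node mostly touches the indices it owns under such a mapping, its cache misses would tend to be served from local rather than remote memory, which is the locality effect the abstract describes.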
This publication has 9 references indexed in Scilit:
- The SGI Origin. Published by Association for Computing Machinery (ACM), 1997
- Performance analysis using the MIPS R10000 performance counters. Published by Association for Computing Machinery (ACM), 1996
- Operating system support for improving data locality on CC-NUMA compute servers. Published by Association for Computing Machinery (ACM), 1996
- Data and computation transformations for multiprocessors. Published by Association for Computing Machinery (ACM), 1995
- Automatic data layout for high performance Fortran. Published by Association for Computing Machinery (ACM), 1995
- Global optimizations for parallelism and locality on scalable parallel machines. Published by Association for Computing Machinery (ACM), 1993
- High performance Fortran. IEEE Parallel & Distributed Technology: Systems & Applications, 1993
- Demonstration of automatic data partitioning techniques for parallelizing compilers on multicomputers. IEEE Transactions on Parallel and Distributed Systems, 1992
- Programming in Vienna Fortran. Scientific Programming, 1992