Evaluating support for global address space languages on the Cray X1
- 26 June 2004
- conference paper
- Published by Association for Computing Machinery (ACM)
- p. 184-195
- https://doi.org/10.1145/1006209.1006236
Abstract
The Cray X1 was recently introduced as the first in a new line of parallel systems to combine high-bandwidth vector processing with an MPP system architecture. Alongside capabilities such as automatic fine-grained data parallelism through the use of vector instructions, the X1 offers hardware support for a transparent global-address space (GAS), which makes it an interesting target for GAS languages. In this paper, we describe our experience with developing a portable, open-source, and high-performance compiler for Unified Parallel C (UPC), an SPMD global-address space language extension of ISO C. As part of our implementation effort, we evaluate the X1's hardware support for GAS languages and provide empirical performance characterizations in the context of leveraging features such as vectorization and global pointers for the Berkeley UPC compiler. We discuss several difficulties encountered in the Cray C compiler which are likely to present challenges for many users, especially implementors of libraries and source-to-source translators. Finally, we analyze the performance of our compiler on some benchmark programs and show that, while there are some limitations of the current compilation approach, the Berkeley UPC compiler uses the X1 network more effectively than MPI or SHMEM, and generates serial code whose vectorizability is comparable to the original C code.
This publication has 10 references indexed in Scilit:
- Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations. Association for Computing Machinery (ACM), 2003
- Early Evaluation of the Cray X1. Association for Computing Machinery (ACM), 2003
- A performance analysis of the Berkeley UPC compiler. Association for Computing Machinery (ACM), 2003
- Communication Optimizations for Parallel C Programs. Journal of Parallel and Distributed Computing, 1999
- LogGP: Incorporating Long Messages into the LogP Model for Parallel Computation. Journal of Parallel and Distributed Computing, 1997
- Analyses and Optimizations for Shared Address Space Programs. Journal of Parallel and Distributed Computing, 1996
- Compiler-based prefetching for recursive data structures. Association for Computing Machinery (ACM), 1996
- Synchronization and communication in the T3E multiprocessor. Association for Computing Machinery (ACM), 1996
- Global communication analysis and optimization. Association for Computing Machinery (ACM), 1996
- Parallel programming in Split-C. Association for Computing Machinery (ACM), 1993