Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations
- 15 November 2003
- proceedings article
- Published by Association for Computing Machinery (ACM)
Abstract
The growing gap between sustained and peak performance for scientific applications is a well-known problem in high end computing. The recent development of parallel vector systems offers the potential to bridge this gap for many computational science codes and deliver a substantial increase in computing capabilities. This paper examines the intranode performance of the NEC SX-6 vector processor and the cache-based IBM Power3/4 superscalar architectures across a number of scientific computing areas. First, we present the performance of a microbenchmark suite that examines low-level machine characteristics. Next, we study the behavior of the NAS Parallel Benchmarks. Finally, we evaluate the performance of several scientific computing codes. Results demonstrate that the SX-6 achieves high performance on a large fraction of our applications and often significantly outperforms the cache-based architectures. However, certain applications are not easily amenable to vectorization and would require extensive algorithm and implementation reengineering to utilize the SX-6 effectively.
This publication has 6 references indexed in Scilit:
- Performance enhancement strategies for multi-block overset grid CFD applications. Parallel Computing, 2003
- Size Scaling of Turbulent Transport in Magnetically Confined Plasmas. Physical Review Letters, 2002
- Solving Einstein's equations on supercomputers. Computer, 1999
- Iterative minimization techniques for ab initio total-energy calculations: molecular dynamics and conjugate gradients. Reviews of Modern Physics, 1992
- Numerically generated black-hole spacetimes: Interaction with gravitational waves. Physical Review D, 1992
- Gyrokinetic particle simulation model. Journal of Computational Physics, 1987