Out-of-order vector architectures
- 1 January 1997
- proceedings article
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Register renaming and out-of-order instruction issue are now commonly used in superscalar processors. As this paper shows, these techniques can also be used to significant advantage in vector processors: performance improves and the available memory bandwidth is used more effectively. Using a trace-driven simulation, we compare a conventional vector implementation, based on the Convex C3400, with an out-of-order, register-renaming vector implementation. When the number of physical registers is above 12, out-of-order execution coupled with register renaming provides a speedup of 1.24-1.72 for realistic memory latencies. Out-of-order techniques also tolerate main memory latencies of 100 cycles with a performance degradation of less than 6%. The mechanisms used for register renaming and out-of-order issue can also be used to support precise interrupts, generally a difficult problem in vector machines. When precise interrupts are implemented, there is typically less than a 10% degradation in performance. A new technique based on register renaming is targeted at dynamically eliminating spill code; this technique is shown to provide an extra speedup ranging between 1.10 and 1.20 while reducing total memory traffic by an average of 15-20%.
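The register renaming mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's mechanism: the class name `RenameTable`, the register counts (8 logical, 16 physical), and the free-list policy are all assumptions chosen for illustration. Each instruction that writes a logical vector register receives a fresh physical register, so a later write to the same logical register does not have to wait for earlier readers.

```python
from collections import deque


class RenameTable:
    """Hypothetical rename table; sizes and policy are illustrative assumptions."""

    def __init__(self, num_logical=8, num_physical=16):
        if num_physical < num_logical:
            raise ValueError("need at least one physical register per logical one")
        # Initially logical register i lives in physical register i.
        self.mapping = list(range(num_logical))
        # The remaining physical registers are free for renaming.
        self.free = deque(range(num_logical, num_physical))

    def rename(self, dest, sources):
        """Rename one instruction.

        Returns (new_dest, physical_sources, old_dest), or None when no
        physical register is free and the instruction would have to stall.
        """
        # Sources are read through the mapping *before* the destination changes.
        phys_sources = [self.mapping[s] for s in sources]
        if not self.free:
            return None
        new_phys = self.free.popleft()
        old_phys = self.mapping[dest]
        self.mapping[dest] = new_phys
        return new_phys, phys_sources, old_phys

    def release(self, phys):
        """Return a physical register to the free list (e.g. at commit)."""
        self.free.append(phys)


table = RenameTable()
first = table.rename(dest=1, sources=[2, 3])   # v1 <- v2 op v3
second = table.rename(dest=1, sources=[1, 4])  # v1 <- v1 op v4
print(first)
print(second)
```

In this sketch the previous physical register is handed back to the caller rather than freed immediately. Holding it until the renaming instruction commits is one common way such a scheme can preserve the old value, which is the kind of property that makes precise interrupts easier to support.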
This publication has 11 references indexed in Scilit:
- CRegs: a new kind of memory for referencing arrays and pointers. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- Architecture of the VPP500 parallel supercomputer. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- The Mips R10000 superscalar microprocessor. IEEE Micro, 1996
- Quantitative analysis of vector code. Published by Institute of Electrical and Electronics Engineers (IEEE), 1995
- Relationship between average and real memory behavior. The Journal of Supercomputing, 1994
- Distributed storage control unit for the Hitachi S-3800 multivector supercomputer. Published by Association for Computing Machinery (ACM), 1994
- Explaining the Gap between Theoretical Peak Performance and Real Performance for Supercomputer Architectures. Scientific Programming, 1994
- The performance impact of vector processor caches. Published by Institute of Electrical and Electronics Engineers (IEEE), 1992
- The parallel processing feature of the NEC SX-3 supercomputer system. International Journal of High Speed Computing, 1991
- The CRAY-1 computer system. Communications of the ACM, 1978