Squeezing more CPU performance out of a Cray-2 by vector block scheduling
- 6 January 2003
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Compile-time scheduling of vector activities on the Cray-2 is studied using a simplified model of the vector instruction stream. An approach based on the authors' experience with array-processor microcode scheduling is shown to be practical. It calls for a pass of loop scheduling followed by a pass of resource allocation. Actual benchmarks of the resulting code are shown, exhibiting speedups as large as 50% over the current CFT77 compiler. The results also give a novel perspective on vector chaining vs. nonchaining processor architectures.