Squeezing more CPU performance out of a Cray-2 by vector block scheduling

6 January 2003

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 237-245
https://doi.org/10.1109/superc.1988.44659

Abstract

Compile-time scheduling of vector activities on the Cray-2 is studied using a simplified model of the vector instruction stream. An approach based on the authors' experience with array-processor microcode scheduling is shown to be practical. It calls for a pass of loop scheduling followed by a pass of resource allocation. Actual benchmarks of the resulting code are shown, exhibiting speedups as large as 50% over the current CFT77 compiler. The results also give a novel perspective on vector chaining vs. nonchaining processor architectures.

This publication has 5 references indexed in Scilit:

Squeezing more CPU performance out of a Cray-2 by vector block scheduling
Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
Instruction Issue Logic in Pipelined Supercomputers
IEEE Transactions on Computers, 1984
Parallel processing
Published by Association for Computing Machinery (ACM), 1984
Register allocation & spilling via graph coloring
Published by Association for Computing Machinery (ACM), 1982
Dependence graphs and compiler optimizations
Published by Association for Computing Machinery (ACM), 1981