Parallelization of loops with exits on pipelined architectures
- 4 December 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Modulo scheduling theory can be applied successfully to overlap Fortran DO loops on pipelined computers issuing multiple operations per cycle, both with and without special loop architectural support. It is shown that a broader class of loops (repeat-until, while, and loops with more than one exit), where the trip count is not known beforehand, can also be overlapped efficiently on multiple-issue pipelined machines. Special features that are required in the architecture, as well as compiler representations for accelerating these loop constructions, are discussed. The approach uses hardware architectural support, program transformation techniques, performance bounds calculations, and scheduling heuristics. Performance results are presented for a few select examples. A prototype scheduler is currently under construction for the Cydra 5 directed dataflow computer.
Author(s)
- Tirumalai, P. (Hewlett-Packard Lab., Palo Alto, CA, USA)
- Lee, M.
- Schlansker, M.
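To make the loop classes named in the abstract concrete, the following is a minimal illustrative sketch in Python (not taken from the paper, which concerns Fortran and the Cydra 5). The function names `count_until_converged` and `find_first` and their parameters are hypothetical. In both loops the number of iterations depends on the data and cannot be computed before the loop starts, which is the property that distinguishes them from a counted DO loop.

```python
def count_until_converged(x, tol):
    """Repeat-until style loop: the body runs at least once and the
    trip count depends on the data (hypothetical example)."""
    steps = 0
    while True:
        x = x / 2.0
        steps += 1
        if x < tol:  # the 'until' condition, tested after the body
            break
    return steps


def find_first(values, target, max_probes):
    """While loop with more than one exit (hypothetical example)."""
    i = 0
    while i < len(values):        # exit 1: data exhausted
        if values[i] == target:   # exit 2: target found
            return i
        if i + 1 == max_probes:   # exit 3: probe budget spent
            return -1
        i += 1
    return -1
```

Overlapping successive iterations of such loops means that later iterations may already be in flight when an exit condition in an earlier iteration turns out to be true, which is presumably why the abstract pairs architectural support with program transformation techniques.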