Parallelization of loops with exits on pipelined architectures

4 December 2002

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 200-212
https://doi.org/10.1109/superc.1990.130021

Abstract

Modulo scheduling theory can be applied successfully to overlap Fortran DO loops on pipelined computers issuing multiple operations per cycle both with and without special loop architectural support. It is shown that a broader class of loops (repeat-until, while, and loops with more than one exit), where the trip count is not known beforehand, can also be overlapped efficiently on multiple issue pipelined machines. Special features that are required in the architecture as well as compiler representations for accelerating these loop constructions are discussed. The approach uses hardware architectural support, program transformation techniques, performance bounds calculations, and scheduling heuristics. Performance results are presented for a few select examples. A prototype scheduler is currently under construction for the Cydra 5 directed dataflow computer.

Author(s)

Tirumalai, P. (Hewlett-Packard Lab., Palo Alto, CA, USA); Lee, M.; Schlansker, M.

This publication has 12 references indexed in Scilit:

The Cydra 5 computer system architecture
Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
The Cydra 5 departmental supercomputer: design philosophies, decisions, and trade-offs
Computer, 1989
Software pipelining: an effective scheduling technique for VLIW machines
Published by Association for Computing Machinery (ACM), 1988
Cydra 5 directed dataflow architecture
Published by Institute of Electrical and Electronics Engineers (IEEE), 1988
A VLIW architecture for a trace scheduling compiler
ACM SIGARCH Computer Architecture News, 1987
A Fortran compiler for the FPS-164 scientific computer
Published by Association for Computing Machinery (ACM), 1984
Conversion of control dependence to data dependence
Published by Association for Computing Machinery (ACM), 1983
Very Long Instruction Word architectures and the ELI-512
Published by Association for Computing Machinery (ACM), 1983
A composite hoisting-strength reduction transformation for global program optimization part I
International Journal of Computer Mathematics, 1982
An Approach to Scientific Array Processing: The Architectural Design of the AP-120B/FPS-164 Family
Computer, 1981