Optimal loop parallelization
- 1 June 1988
- proceedings article
- Published by Association for Computing Machinery (ACM)
- Vol. 23 (7), 308-317
- https://doi.org/10.1145/53990.54021
Abstract
Parallelizing compilers promise to exploit the parallelism available in a given program, particularly parallelism that is too low-level or irregular to be expressed by hand in an algorithm. However, existing parallelization techniques do not handle loops in a satisfactory manner. Fine-grain (instruction-level) parallelization, or compaction, captures irregular parallelism inside a loop body but does not exploit parallelism across loop iterations. Coarser methods, such as doacross [9], sacrifice irregular forms of parallelism in favor of pipelining iterations (software pipelining). Both of these approaches often yield suboptimal speedups even under the best conditions, when resources are plentiful and processors are synchronous. In this paper we present a new technique bridging the gap between fine- and coarse-grain loop parallelization, allowing the exploitation of parallelism both inside and across loop iterations. Furthermore, we show that, given a loop and a set of dependencies between its statements, the execution schedule obtained by our transformation is time optimal: no transformation of the loop based on the given data dependencies can yield a shorter running time for that loop.
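To make the contrast in the abstract concrete, the sketch below compares a toy loop with a hand-pipelined version of it. The loop, the array names, and the two-statement body are hypothetical and chosen only for illustration; the rewrite shows the general idea of overlapping statements from adjacent iterations when the dependencies allow it, not the paper's actual transformation or its optimality construction.

```c
#include <stdio.h>

#define N 8

int main(void) {
    double b[N], c[N], a[N];
    double s_seq = 0.0, s_pipe = 0.0;

    for (int i = 0; i < N; i++) { b[i] = i + 1.0; c[i] = 2.0; }

    /* Original loop: S1 has no loop-carried dependence; S2 carries a
     * dependence through s (S2 of iteration i needs S2 of iteration i-1). */
    for (int i = 0; i < N; i++) {
        a[i]  = b[i] * c[i];    /* S1 */
        s_seq = s_seq + a[i];   /* S2 */
    }

    /* Hand-pipelined version: because S1 of iteration i+1 is independent of
     * S2 of iteration i, the two statements in the steady-state body could
     * issue together on a machine with enough resources. */
    a[0] = b[0] * c[0];                 /* prologue: S1 of iteration 0   */
    for (int i = 0; i < N - 1; i++) {
        s_pipe = s_pipe + a[i];         /* S2 of iteration i             */
        a[i+1] = b[i+1] * c[i+1];       /* S1 of iteration i+1           */
    }
    s_pipe = s_pipe + a[N-1];           /* epilogue: S2 of last iteration */

    printf("sequential sum = %g, pipelined sum = %g\n", s_seq, s_pipe);
    return 0;
}
```

Any gain here comes purely from overlapping S1 and S2 across iterations; the paper's contribution, as stated in the abstract, is that such a schedule can be derived systematically from the dependence information and shown to be time optimal rather than constructed ad hoc.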
This publication has 5 references indexed in Scilit:
- A development environment for horizontal microcode, IEEE Transactions on Software Engineering, 1988
- A compilation technique for software pipelining of loops with conditional jumps, Published by Association for Computing Machinery (ACM), 1987
- Automatic loop interchange, Published by Association for Computing Machinery (ACM), 1984
- Conversion of control dependence to data dependence, Published by Association for Computing Machinery (ACM), 1983
- Dependence graphs and compiler optimizations, Published by Association for Computing Machinery (ACM), 1981