Static analysis of upper and lower bounds on dependences and parallelism
- 1 July 1994
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Programming Languages and Systems
- Vol. 16 (4), 1248-1278
- https://doi.org/10.1145/183432.183525
Abstract
Existing compilers often fail to parallelize sequential code, even when a program can be manually transformed into parallel form by a sequence of well-understood transformations (as in the case for many of the Perfect Club Benchmark programs). These failures can occur for several reasons: the code transformations implemented in the compiler may not be sufficient to produce parallel code, the compiler may not find the proper sequence of transformations, or the compiler may not be able to prove that one of the necessary transformations is legal. When a compiler fails to extract sufficient parallelism from a program, the programmer may try to extract additional parallelism. Unfortunately, the programmer is typically left to search for parallelism without significant assistance. The compiler generally does not give feedback about which parts of the program might contain additional parallelism, or about the types of transformations that might be needed to realize this parallelism. Standard program transformations and dependence abstractions cannot be used to provide this feedback. In this paper, we propose a two-step approach to the search for parallelism in sequential programs. In the first step, we construct several sets of constraints that describe, for each statement, which iterations of that statement can be executed concurrently. By constructing constraints that correspond to different assumptions about which dependences might be eliminated through additional analysis, transformations, and user assertions, we can determine whether we can expose parallelism by eliminating dependences. In the second step of our search for parallelism, we examine these constraint sets to identify the kinds of transformations needed to exploit scalable parallelism. Our tests will identify conditional parallelism and parallelism that can be exposed by combinations of transformations that reorder the iteration space (such as loop interchange and loop peeling). 
This approach lets us distinguish inherently sequential code from code that contains unexploited parallelism. It also produces information about the kinds of transformations needed to parallelize the code, without worrying about the order of application of the transformations. Furthermore, when our dependence test is inexact we can identify which unresolved dependences inhibit parallelism by comparing the effects of assuming dependence or independence. We are currently exploring the use of this information in programmer-assisted parallelization.
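The abstract's notion of iterations that "can be executed concurrently" can be made concrete with a small example. The sketch below is not the paper's constraint-based method (which builds exact integer constraint sets); it is a hypothetical brute-force check of which loop in a nest carries a flow dependence. For the nest `for i: for j: a[i][j] = a[i-1][j] + 1`, iteration (i, j) writes a[i][j] and reads a[i-1][j]; the i loop carries the dependence while iterations of the j loop are independent, so the j loop can run in parallel.

```python
def carried_by(loop, N=4, M=4):
    """Brute-force illustration (not the paper's algorithm): report whether
    the given loop level ('i' or 'j') carries the flow dependence of the
    statement a[i][j] = a[i-1][j] + 1 over a small iteration space."""
    iters = [(i, j) for i in range(1, N) for j in range(M)]
    for (i1, j1) in iters:          # writer: writes a[i1][j1]
        for (i2, j2) in iters:      # reader: reads a[i2-1][j2]
            if (i1, j1) == (i2 - 1, j2):
                # Flow dependence from iteration (i1, j1) to (i2, j2).
                if loop == 'i' and i1 != i2:
                    return True     # dependence crosses i iterations
                if loop == 'j' and i1 == i2 and j1 != j2:
                    return True     # dependence crosses j iterations
    return False

print(carried_by('i'))  # True  -> the i loop must run sequentially
print(carried_by('j'))  # False -> iterations of the j loop are independent
```

Here every dependence has distance vector (1, 0), so only the i loop carries it; this is the kind of information the paper's constraint sets encode exactly, for symbolic loop bounds, rather than by enumeration.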