Parallelizing nonnumerical code with selective scheduling and software pipelining

Open Access

1 November 1997

journal article
research article
Published by Association for Computing Machinery (ACM) in ACM Transactions on Programming Languages and Systems

Vol. 19 (6) , 853-898
https://doi.org/10.1145/267959.269966

Abstract

Instruction-level parallelism (ILP) in nonnumerical code is regarded as scarce and hard to exploit due to its irregularity. In this article, we introduce a new code-scheduling technique for irregular ILP called “selective scheduling,” which can be used as a component of superscalar and VLIW compilers. Selective scheduling can compute a wide set of independent operations across all execution paths based on renaming and forward-substitution, and it can compute available operations across loop iterations when combined with software pipelining. This scheduling approach has better heuristics for determining the usefulness of moving one operation versus moving another and can successfully find useful code motions without resorting to branch profiling. The compile-time overhead of selective scheduling is low due to its incremental computation technique and its controlled code duplication. We parallelized the SPEC integer benchmarks and five AIX utilities without using branch probabilities. The experiments indicate that a fivefold speedup is achievable on realistic resources with a reasonable overhead in compilation time and code expansion, and that a solid speedup is also obtainable on machines with fewer resources. These results improve on previously known characteristics of irregular ILP.
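
The abstract names renaming and forward-substitution as the two transformations that expose independent operations. As a rough illustration only, the sketch below applies both to a toy three-address operation. The Op record and the helper names forward_substitute and rename are hypothetical, invented for this example, and are not taken from the article; the sketch ignores the legality checks (liveness, control flow) that a real scheduler would need.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Op:
    """Hypothetical three-address operation: dest = opcode(srcs)."""
    dest: str
    opcode: str
    srcs: Tuple[str, ...]


def forward_substitute(op: Op, copy: Op) -> Op:
    """Rewrite `op` so it no longer reads the destination of `copy`.

    If `copy` is `r3 = copy r1` and `op` reads r3, replacing r3 with r1
    removes the true dependence, so `op` can be moved above the copy.
    """
    assert copy.opcode == "copy" and len(copy.srcs) == 1
    new_srcs = tuple(copy.srcs[0] if s == copy.dest else s for s in op.srcs)
    return Op(op.dest, op.opcode, new_srcs)


def rename(op: Op, fresh: str) -> Tuple[Op, Op]:
    """Split `op` into a hoistable op writing `fresh` plus a copy left in place.

    Writing to a fresh register removes anti- and output dependences on the
    original destination; the copy restores the original name where `op` was.
    """
    hoisted = Op(fresh, op.opcode, op.srcs)
    leftover = Op(op.dest, "copy", (fresh,))
    return hoisted, leftover


if __name__ == "__main__":
    copy = Op("r3", "copy", ("r1",))
    mul = Op("r4", "mul", ("r3", "r2"))
    print(forward_substitute(mul, copy))

    add = Op("r1", "add", ("r5", "r6"))
    print(rename(add, "t1"))
```

In this toy setting, forward_substitute turns "r4 = r3 * r2" into "r4 = r1 * r2", and rename turns "r1 = r5 + r6" into "t1 = r5 + r6" followed by "r1 = copy t1".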
