Extraction of massive instruction level parallelism

1 June 1993

journal article
Published by Association for Computing Machinery (ACM) in ACM SIGARCH Computer Architecture News

Vol. 21 (3), 5-12
https://doi.org/10.1145/152835.152836

Abstract

Our goal is to dramatically increase the performance of uniprocessors through the exploitation of instruction level parallelism, i.e. the parallelism that exists among the machine instructions of a program. Speculative execution may help a lot, but, it is argued, both branch prediction and eager execution are insufficient to achieve speedup factors in the tens (with respect to sequential execution) at reasonable hardware cost.

A new form of code execution, Disjoint Eager Execution (DEE), is proposed which uses less hardware than pure eager execution and performs better than pure branch prediction; DEE is a continuum between branch prediction and eager execution. DEE is shown to be optimal when processing resources are constrained.

Branches are predicted in DEE, but the predictions should be made in parallel in order to obtain high performance. This is not allowed, however, by the standard instruction stream model, the dynamic model (in which the order is as indicated by the contents of the Program Counter).

The use of the static instruction stream is proposed instead. The static instruction stream order is the same as the order of the code in memory, and is independent of the execution of branches. It allows reduced branch dependencies as well.

It is argued that a new version, Levo, of an old machine model, CONDEL-2, will be able to attain massive Instruction Level Parallelism.
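The claim that DEE sits between branch prediction and eager execution can be illustrated with a small model. The sketch below is not taken from the paper: it assumes that every branch is predicted correctly with the same probability `p`, that a fixed `budget` of branch paths can be executed speculatively, and that a DEE-like policy simply picks the paths with the highest cumulative probability. The function names (`greedy_paths`, `single_path`, `eager_paths`, `expected_useful`) and the figure of merit (sum of cumulative path probabilities) are hypothetical choices made for illustration only.

```python
import heapq
from itertools import product


def greedy_paths(p, budget):
    """Assumed DEE-like policy: take the `budget` most likely branch paths.

    A path is a string of 'P' (predicted direction) and 'N' (not-predicted
    direction); its cumulative probability is the product of the
    per-branch probabilities along it.
    """
    # Max-heap via negated probabilities, seeded with the two depth-1 paths.
    heap = [(-p, "P"), (-(1.0 - p), "N")]
    heapq.heapify(heap)
    chosen = []
    while heap and len(chosen) < budget:
        neg_prob, path = heapq.heappop(heap)
        chosen.append((path, -neg_prob))
        # A child is never more likely than its parent, so expanding only
        # chosen paths keeps the selection a connected tree.
        heapq.heappush(heap, (neg_prob * p, path + "P"))
        heapq.heappush(heap, (neg_prob * (1.0 - p), path + "N"))
    return chosen


def single_path(p, budget):
    """Pure branch prediction: follow only the predicted direction."""
    return [("P" * depth, p ** depth) for depth in range(1, budget + 1)]


def eager_paths(p, budget):
    """Pure eager execution: take both directions, level by level."""
    chosen = []
    depth = 1
    while len(chosen) < budget:
        for dirs in product("PN", repeat=depth):
            if len(chosen) == budget:
                break
            path = "".join(dirs)
            prob = p ** path.count("P") * (1.0 - p) ** path.count("N")
            chosen.append((path, prob))
        depth += 1
    return chosen


def expected_useful(chosen):
    """Illustrative figure of merit: sum of cumulative path probabilities."""
    return sum(prob for _, prob in chosen)


if __name__ == "__main__":
    for name, policy in [("prediction", single_path),
                         ("eager", eager_paths),
                         ("greedy", greedy_paths)]:
        paths = policy(0.7, 6)
        print(name, [path for path, _ in paths], expected_useful(paths))
```

Under these assumptions the greedy selection degenerates to single-path prediction when `p` is 1.0 and to level-by-level eager execution when `p` is 0.5, which is one way to read the "continuum" described in the abstract. For intermediate accuracies it mixes a long predicted path with a few short side paths.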
