Very Long Instruction Word architectures and the ELI-512

13 June 1983

journal article
conference paper
Published by Association for Computing Machinery (ACM) in ACM SIGARCH Computer Architecture News

Vol. 11 (3), 140-150
https://doi.org/10.1145/1067651.801649

Abstract

By compiling ordinary scientific applications programs with a radical technique called trace scheduling, we are generating code for a parallel machine that will run these programs faster than an equivalent sequential machine—we expect 10 to 30 times faster. Trace scheduling generates code for machines called Very Long Instruction Word architectures. In Very Long Instruction Word machines, many statically scheduled, tightly coupled, fine-grained operations execute in parallel within a single instruction stream. VLIWs are more parallel extensions of several current architectures. These current architectures have never cracked a fundamental barrier. The speedup they get from parallelism is never more than a factor of 2 to 3. Not that we couldn't build more parallel machines of this type; but until trace scheduling we didn't know how to generate code for them. Trace scheduling finds sufficient parallelism in ordinary code to justify thinking about a highly parallel VLIW. At Yale we are actually building one. Our machine, the ELI-512, has a horizontal instruction word of over 500 bits and will do 10 to 30 RISC-level operations per cycle [Patterson 82]. ELI stands for Enormously Longword Instructions; 512 is the size of the instruction word we hope to achieve. (The current design has a 1200-bit instruction word.) Once it became clear that we could actually compile code for a VLIW machine, some new questions appeared, and answers are presented in this paper. How do we put enough tests in each cycle without making the machine too big? How do we put enough memory references in each cycle without making the machine too slow?
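The abstract's description of a VLIW, in which many statically scheduled, fine-grained operations issue together in one instruction, can be illustrated with a small sketch. This is not trace scheduling itself and is not taken from the paper: it is a plain greedy list scheduler over straight-line code, assuming every operation takes one cycle. The names `Op` and `pack_into_long_words` and the `width` parameter are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One fine-grained operation; `deps` names the ops whose results it needs."""
    name: str
    deps: tuple = ()


def pack_into_long_words(ops, width):
    """Greedily pack ops into long instruction words of at most `width` slots.

    Assumption (not from the paper): every op has unit latency, so an op may
    issue in any cycle after all of its dependencies have issued.
    """
    done = set()             # names of ops issued in earlier cycles
    remaining = list(ops)    # ops not yet placed, in program order
    words = []               # one list of op names per cycle
    while remaining:
        ready = [op for op in remaining if all(d in done for d in op.deps)]
        if not ready:
            raise ValueError("cyclic or unsatisfiable dependencies")
        word = ready[:width]                      # fill the available slots
        issued = {op.name for op in word}
        words.append([op.name for op in word])
        done |= issued
        remaining = [op for op in remaining if op.name not in issued]
    return words


# Hypothetical straight-line fragment computing a * b + c and storing it.
example_ops = [
    Op("load_a"),
    Op("load_b"),
    Op("load_c"),
    Op("mul", ("load_a", "load_b")),
    Op("add", ("mul", "load_c")),
    Op("store", ("add",)),
]

if __name__ == "__main__":
    for cycle, word in enumerate(pack_into_long_words(example_ops, width=4), 1):
        print(cycle, word)
```

With four slots this fragment needs four cycles instead of six, because only the three loads are independent. The dependency chain through `mul`, `add` and `store` limits the gain. Scheduling confined to short straight-line fragments hits this limit quickly, which is consistent with the factor of 2 to 3 that the abstract attributes to existing architectures and with its claim that trace scheduling is what finds enough parallelism to justify a wider machine.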
