Superscalar execution with dynamic data forwarding
- 27 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 130-135
- https://doi.org/10.1109/pact.1998.727183
Abstract
We empirically demonstrate that, to take advantage of increasing issue widths, superscalar processors require instruction window sizes that grow quadratically. Because conventional central window designs aim to provide full data fan-out to every instruction in the window, building large instruction windows with conventional techniques is not feasible. We show that full data fan-out is not necessary for high performance when a novel approach is used to distribute values. We implement the instruction window with direct matching in a small on-chip memory called the wait memory, and bring a small subset of instructions that are likely to become ready into a match unit, where instruction selection and operand matching are performed. We show that the match unit needs to grow only linearly with the issue width. Using the SPEC95 benchmarks, we demonstrate that, at a given instruction window size, our algorithm provides over 90 percent of the IPC obtained by a central window implementation that provides full data fan-out.
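The wait memory and match unit described in the abstract can be illustrated with a toy scheduler. This is only a minimal sketch under stated assumptions, not the paper's design: the names `Instr`, `simulate`, `example_program`, `match_capacity` and `issue_width` are hypothetical, every instruction is assumed to take one cycle, and "likely to become ready" is simplified to "all source operands already produced". The point it shows is that a producer delivers its result to consumers by slot index (direct matching) instead of broadcasting to the whole window, while selection happens in a small, bounded match unit.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Instr:
    """One wait-memory entry (all field names are hypothetical)."""
    name: str
    pending: int                                    # source operands not yet produced
    consumers: list = field(default_factory=list)   # wait-memory slots of dependents


def simulate(wait_memory, match_capacity, issue_width):
    """Return the instruction names issued in each cycle (toy model)."""
    # Instructions with no pending operands are candidates for the match unit.
    ready = deque(s for s, ins in enumerate(wait_memory) if ins.pending == 0)
    match_unit = []
    schedule = []
    remaining = len(wait_memory)
    while remaining:
        # Bring candidate instructions into the small match unit.
        while ready and len(match_unit) < match_capacity:
            match_unit.append(ready.popleft())
        # Select up to issue_width instructions from the match unit only.
        issued, match_unit = match_unit[:issue_width], match_unit[issue_width:]
        if not issued:
            raise ValueError("no instruction could issue (cyclic dependence or zero width)")
        schedule.append([wait_memory[s].name for s in issued])
        remaining -= len(issued)
        # Direct matching: each result goes straight to its consumers' slots,
        # with no broadcast to every entry in the window.
        for s in issued:
            for c in wait_memory[s].consumers:
                wait_memory[c].pending -= 1
                if wait_memory[c].pending == 0:
                    ready.append(c)
    return schedule


def example_program():
    """A five-instruction dependence graph: a, b -> c; b -> d; c, d -> e."""
    return [
        Instr("a", 0, [2]),
        Instr("b", 0, [2, 3]),
        Instr("c", 2, [4]),
        Instr("d", 1, [4]),
        Instr("e", 2, []),
    ]


if __name__ == "__main__":
    print(simulate(example_program(), match_capacity=4, issue_width=2))
```

In this toy model, the work done per cycle depends on the match unit size and the number of consumers of each issued instruction, not on the total number of entries in the wait memory.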