Reducing the performance impact of instruction cache misses by writing instructions into the reservation stations out-of-order

22 November 2002

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

pp. 34-43
https://doi.org/10.1109/micro.1997.645795

Abstract

In conventional processors, each instruction cache fetch brings in a group of instructions. Upon encountering an instruction cache miss, the processor will wait until the instruction cache miss is serviced before continuing to fetch any new instructions. The paper presents a new technique, called out-of-order issue, which allows the processor to temporarily ignore the instructions associated with the instruction cache miss. The processor attempts to fetch the instructions that follow the group of instructions associated with the miss. These instructions are then decoded and written into the processor's reservation stations. Later, after the instruction cache miss has been serviced, the instructions associated with the miss are decoded and written into the reservation stations. (We use the term issue to indicate the act of writing instructions into the reservation stations. With this technique, instructions are not written into the reservation stations in program order. Hence, the term out-of-order issue.) We introduce the concept of out-of-order issue, describe its implementation, and present some initial data showing the performance gains possible with out-of-order issue.
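
The issue ordering described in the abstract can be illustrated with a toy model. The following is only a sketch under simplifying assumptions that do not come from the paper: one fetch group is handled per cycle, the miss latency is fixed, at most one miss is deferred at a time (a second miss falls back to stalling), and branches, data dependences and reservation-station capacity are ignored. The function name issue_order and its parameters are hypothetical.

```python
def issue_order(groups, miss_latency, out_of_order):
    """Toy model of the front end (assumptions are listed in the text above).

    groups: list of (name, hits_in_icache) pairs in program order.
    Returns a list of (cycle, name) in the order the groups are written
    into the reservation stations.
    """
    issued = []
    pending = None  # (ready_cycle, name) of the single deferred miss
    cycle = 0
    for name, hit in groups:
        # A deferred group is issued as soon as its miss has been serviced.
        if pending is not None and pending[0] <= cycle:
            issued.append((cycle, pending[1]))
            pending = None
            cycle += 1
        if hit:
            issued.append((cycle, name))
            cycle += 1
        elif out_of_order and pending is None:
            # Temporarily ignore the missing group and keep fetching past it.
            pending = (cycle + miss_latency, name)
            cycle += 1
        else:
            # Stall: conventional fetch, or a second miss while one is outstanding.
            detect = cycle
            if pending is not None:
                cycle = max(cycle, pending[0])
                issued.append((cycle, pending[1]))
                pending = None
                cycle += 1
            cycle = max(cycle, detect + miss_latency)
            issued.append((cycle, name))
            cycle += 1
    # Issue a group that is still deferred when the input runs out.
    if pending is not None:
        issued.append((max(cycle, pending[0]), pending[1]))
    return issued


if __name__ == "__main__":
    # Hypothetical trace: group B misses in the instruction cache.
    trace = [("A", True), ("B", False), ("C", True), ("D", True), ("E", True)]
    print("conventional:", issue_order(trace, 3, out_of_order=False))
    print("out-of-order:", issue_order(trace, 3, out_of_order=True))
```

In this toy trace the conventional front end issues the groups in program order and waits out the miss on B, while the out-of-order variant writes C and D into the reservation stations first and issues B once its miss has been serviced. The cycle counts are artifacts of the model and say nothing about the gains reported in the paper.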
