Speculation techniques for improving load related instruction scheduling
- 1 May 1999
- journal article
- Published by Association for Computing Machinery (ACM) in ACM SIGARCH Computer Architecture News
- Vol. 27 (2) , 42-53
- https://doi.org/10.1145/307338.300983
Abstract
State of the art microprocessors achieve high performance by executing multiple instructions per cycle. In an out-of-order engine, the instruction scheduler is responsible for dispatching instructions to execution units based on dependencies, latencies, and resource availability. Most existing instruction schedulers are doing a less than optimal job of scheduling memory accesses and instructions dependent on them, for the following reasons:
• Memory dependencies cannot be resolved prior to execution, so loads are not advanced ahead of preceding stores.
• The dynamic latencies of load instructions are unknown, so scheduling dependent instructions is based on either optimistic load-use delay (may cause re-scheduling and re-execution) or pessimistic delay (creating unnecessary delays).
• Memory pipelines are more expensive than other execution units, and as such, are a scarce resource. Currently, an increase in the memory execution bandwidth is usually achieved through multi-banked caches where bank conflicts limit efficiency.
In this paper we present three techniques to address these scheduler limitations. One is to improve the scheduling of load instructions by using a simple memory disambiguation mechanism. The second is to improve the scheduling of load dependent instructions by employing a Data Cache Hit-Miss Predictor to predict the dynamic load latencies. And the third is to improve the efficiency of load scheduling in a multi-banked cache through Cache-Bank Prediction.
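To make the idea of a data cache hit-miss predictor concrete, the following is a minimal sketch and not the design from the paper, since the abstract does not describe how the predictor is organized. It assumes a generic table of 2-bit saturating counters indexed by the load's program counter. The class name, table size, indexing function, and the hit and miss latencies are all hypothetical values chosen for illustration.

```python
class HitMissPredictor:
    """Hypothetical per-load hit/miss predictor built from 2-bit saturating counters.

    This is an illustrative assumption, not the organization used in the paper.
    """

    def __init__(self, entries=1024):
        # Table size is an assumption; it must be a power of two for the mask below.
        assert entries > 0 and entries & (entries - 1) == 0
        self.mask = entries - 1
        # Counters start at 3 ("strongly hit"); this assumes loads usually hit.
        self.counters = [3] * entries

    def _index(self, pc):
        # Drop the two low bits (assumes 4-byte aligned instructions).
        return (pc >> 2) & self.mask

    def predict_hit(self, pc):
        # Counter values 2 and 3 predict a hit; 0 and 1 predict a miss.
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, was_hit):
        # Train the counter with the actual cache outcome once the load resolves.
        i = self._index(pc)
        if was_hit:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def load_use_delay(predictor, pc, hit_latency=3, miss_latency=12):
    """Latency a scheduler might assume for instructions dependent on this load.

    Both latency values are placeholders, not figures from the paper.
    """
    return hit_latency if predictor.predict_hit(pc) else miss_latency
```

In this sketch the scheduler would call `load_use_delay` when deciding when to wake up the instructions that consume a load's result, and call `update` after the cache access completes. A predicted hit corresponds to the optimistic load-use delay mentioned in the abstract, and a predicted miss to a longer delay that avoids re-scheduling.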