Streamlining inter-operation memory communication via data dependence prediction
- 23 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 10724451, p. 235-245
- https://doi.org/10.1109/micro.1997.645814
Abstract
We revisit memory hierarchy design, viewing memory as an inter-operation communication agent. This perspective leads to the development of novel methods of performing inter-operation memory communication. We use data dependence prediction to identify and link dependent loads and stores so that they can communicate speculatively without incurring the overhead of address calculation, disambiguation and data cache access. We also use data dependence prediction to convert DEF-store-load-USE chains within the instruction window into DEF-USE chains prior to address calculation and disambiguation. We use true and output data dependence status prediction to introduce and manage a small storage structure called the transient value cache (TVC). The TVC captures memory values that are short-lived. It also captures recently stored values that are likely to be accessed soon. Accesses that are serviced by the TVC do not have to be serviced by other parts of the memory hierarchy, e.g., the data cache. The first two techniques are aimed at reducing the effective communication latency, whereas the last technique is aimed at reducing data cache bandwidth requirements. Experimental analysis of the proposed techniques shows that the proposed speculative communication methods correctly handle a large fraction of memory dependences, and that a large number of the loads and stores never have to reach the data cache when the TVC is in place.
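The idea of linking a dependent load to the store that feeds it can be illustrated with a toy software model. The sketch below is only an illustration under stated assumptions, not the mechanism evaluated in the paper: the class name `DependencePredictor`, its tables, and the training rule (remember the store that last wrote the loaded address) are all hypothetical, and a plain dictionary stands in for the data cache.

```python
class DependencePredictor:
    """Toy model (hypothetical): learn which store feeds which load,
    then hand the load a speculative value before its address is checked."""

    def __init__(self):
        self.producer_of = {}   # load PC -> store PC predicted to feed it
        self.last_value = {}    # store PC -> most recent value it stored
        self.last_writer = {}   # address -> store PC that last wrote it
        self.memory = {}        # address -> value (stands in for the data cache)

    def store(self, pc, addr, value):
        self.last_value[pc] = value
        self.last_writer[addr] = pc
        self.memory[addr] = value

    def load(self, pc, addr):
        """Return (value, speculated, correct)."""
        # Speculative step: uses only the load PC, never the address.
        predicted = self.producer_of.get(pc)
        speculated = predicted in self.last_value
        spec_value = self.last_value[predicted] if speculated else None

        # Verification step: the conventional, address-based access.
        actual = self.memory.get(addr, 0)
        correct = speculated and spec_value == actual

        # Training: remember the store that really produced this value.
        writer = self.last_writer.get(addr)
        if writer is not None:
            self.producer_of[pc] = writer
        else:
            self.producer_of.pop(pc, None)
        return actual, speculated, correct


p = DependencePredictor()
p.store(0x10, 100, 7)
first = p.load(0x20, 100)    # no prediction yet: (7, False, False)
p.store(0x10, 104, 9)
second = p.load(0x20, 104)   # predicted and verified: (9, True, True)
```

In this model the second load obtains its value from the predicted store without consulting the address, and the address-based access serves only as verification. A hardware design would also need recovery on a misprediction, which this sketch omits.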