Randomized cache placement for eliminating conflicts
- 1 February 1999
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Computers
- Vol. 48 (2) , 185-192
- https://doi.org/10.1109/12.752660
Abstract
Applications with regular patterns of memory access can experience high levels of cache conflict misses. In shared-memory multiprocessors conflict misses can be increased significantly by the data transpositions required for parallelization. Techniques such as blocking which are introduced within a single thread to improve locality, can result in yet more conflict misses. The tension between minimizing cache conflicts and the other transformations needed for efficient parallelization leads to complex optimization problems for parallelizing compilers. This paper shows how the introduction of a pseudorandom element into the cache index function can effectively eliminate repetitive conflict misses and produce a cache where miss ratio depends solely on working set behavior. We examine the impact of pseudorandom cache indexing on processor cycle times and present practical solutions to some of the major implementation issues for this type of cache. Our conclusions are supported by simulations of a superscalar out-of-order processor executing the SPEC95 benchmarks, as well as from cache simulations of individual loop kernels to illustrate specific effects. We present measurements of instructions committed per cycle (IPC) when comparing the performance of different cache architectures on whole-program benchmarks such as the SPEC95 suite.Keywords
This publication has 22 references indexed in Scilit:
- Speculative execution via address prediction and data prefetchingPublished by Association for Computing Machinery (ACM) ,1997
- Eliminating cache conflict misses through XOR-based placement functionsPublished by Association for Computing Machinery (ACM) ,1997
- The case for a single-chip multiprocessorPublished by Association for Computing Machinery (ACM) ,1996
- ARB: a hardware mechanism for dynamic reordering of memory referencesIEEE Transactions on Computers, 1996
- Streamlining data cache access with fast address calculationPublished by Association for Computing Machinery (ACM) ,1995
- Avoiding conflict misses dynamically in large direct-mapped cachesPublished by Association for Computing Machinery (ACM) ,1994
- ATOMPublished by Association for Computing Machinery (ACM) ,1994
- Skewed-associative cachesPublished by Springer Nature ,1993
- The cache performance and optimizations of blocked algorithmsPublished by Association for Computing Machinery (ACM) ,1991
- Cache MemoriesACM Computing Surveys, 1982