Analyzing data reuse for cache reconfiguration
- 1 November 2005
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Embedded Computing Systems
- Vol. 4 (4), 851-876
- https://doi.org/10.1145/1113830.1113836
Abstract
Classical compiler optimizations assume a fixed cache architecture and modify the program to take best advantage of it. In some cases, this may not be the best strategy because each nest might work best with a different cache configuration and transforming a nest for a given fixed cache configuration may not be possible due to data and control dependences. Working with a fixed cache configuration can also increase energy consumption in loops where the best required configuration is smaller than the default (fixed) one. In this paper, we take an alternate approach and modify the cache configuration for each nest, depending on the access pattern exhibited by the nest. We call this technique compiler-directed cache polymorphism (CDCP). More specifically, in this paper, we make the following contributions. First, we present an approach for analyzing data reuse properties of loop nests. Second, we give algorithms to simulate the footprints of array references in their reuse space. Third, based on our reuse analysis, we present an optimization algorithm to compute the cache configurations for each loop nest. Our experimental results show that CDCP is very effective in finding the near-optimal data cache configurations for different nests in array-intensive applications.
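The premise that a loop nest has a smallest cache configuration that still captures its reuse can be illustrated with a small sketch. This is not the CDCP algorithm from the paper, which works by compile-time reuse analysis. It is a brute-force simulation under stated assumptions: a fully associative LRU cache, a hypothetical nest that reads a 64-element array of 8-byte values four times, and three made-up candidate capacities. The function names `count_misses` and `smallest_best_capacity` are illustrative only.

```python
from collections import OrderedDict


def count_misses(trace, capacity, line_size):
    """Count misses in a fully associative LRU cache (a simplifying assumption)."""
    lines = OrderedDict()  # cache line tag -> placeholder, ordered by recency
    max_lines = capacity // line_size
    misses = 0
    for addr in trace:
        tag = addr // line_size
        if tag in lines:
            lines.move_to_end(tag)  # hit: mark as most recently used
        else:
            misses += 1
            lines[tag] = True
            if len(lines) > max_lines:
                lines.popitem(last=False)  # evict the least recently used line
    return misses


def smallest_best_capacity(trace, capacities, line_size):
    """Return the smallest capacity that achieves the minimum miss count."""
    results = {c: count_misses(trace, c, line_size) for c in capacities}
    best = min(results.values())
    return min(c for c, m in results.items() if m == best), results


# Hypothetical nest: for i in 0..3: for j in 0..63: read a[j] (8-byte elements)
trace = [8 * j for i in range(4) for j in range(64)]
choice, misses = smallest_best_capacity(trace, [256, 512, 1024], line_size=32)
print(choice, misses)
```

In this toy case the array occupies 512 bytes, so a 256-byte cache loses every line before it is reused, while 512 and 1024 bytes both retain the whole footprint. The smaller of the two is selected, which mirrors the abstract's point that a nest may need less than the default fixed configuration.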