Adaptive Cache Compression for High-Performance Processors
- 2 March 2004
- journal article
- Published by Association for Computing Machinery (ACM) in ACM SIGARCH Computer Architecture News
- Vol. 32 (2), 212
- https://doi.org/10.1145/1028176.1006719
Abstract
Modern processors use two or more levels of cache memories to bridge the rising disparity between processor and memory speeds. Compression can improve cache performance by increasing effective cache capacity and eliminating misses. However, decompressing cache lines also increases cache access latency, potentially degrading performance.
In this paper, we develop an adaptive policy that dynamically adapts to the costs and benefits of cache compression. We propose a two-level cache hierarchy where the L1 cache holds uncompressed data and the L2 cache dynamically selects between compressed and uncompressed storage. The L2 cache is 8-way set-associative with LRU replacement, where each set can store up to eight compressed lines but has space for only four uncompressed lines. On each L2 reference, the LRU stack depth and compressed size determine whether compression (could have) eliminated a miss or incurs an unnecessary decompression overhead. Based on this outcome, the adaptive policy updates a single global saturating counter, which predicts whether to allocate lines in compressed or uncompressed form.
We evaluate adaptive cache compression using full-system simulation and a range of benchmarks. We show that compression can improve performance for memory-intensive commercial workloads by up to 17%. However, always using compression hurts performance for low-miss-rate benchmarks (due to unnecessary decompression overhead), degrading performance by up to 18%. By dynamically monitoring workload behavior, the adaptive policy achieves comparable benefits from compression, while never degrading performance by more than 0.4%.
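The global saturating counter described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `GlobalCompressionPredictor`, the `benefit`, `cost`, and `limit` values, and the rule that any hit at LRU stack depth four or deeper counts as a miss avoided by compression are all simplifying assumptions made for illustration. Only the 8-way set with room for four uncompressed lines and the single global counter come from the abstract.

```python
from dataclasses import dataclass

# From the abstract: each 8-way set has space for only four uncompressed lines.
UNCOMPRESSED_WAYS = 4


@dataclass
class GlobalCompressionPredictor:
    """Hypothetical single global saturating counter (all weights are assumed)."""

    benefit: int = 1   # assumed credit when compression avoids a miss
    cost: int = 1      # assumed debit for an unnecessary decompression
    limit: int = 8     # assumed saturation bound; counter stays in [-limit, +limit]
    counter: int = 0

    def _add(self, delta: int) -> None:
        # Saturating update: clamp instead of overflowing.
        self.counter = max(-self.limit, min(self.limit, self.counter + delta))

    def record_hit(self, stack_depth: int, stored_compressed: bool) -> None:
        """Account for an L2 hit; stack_depth is 0-based (0 = most recently used)."""
        if stack_depth >= UNCOMPRESSED_WAYS:
            # Simplification: the line is deeper than an uncompressed set could
            # hold, so treat the hit as a miss that compression eliminated.
            self._add(self.benefit)
        elif stored_compressed:
            # The line would have hit without compression, so the
            # decompression latency bought nothing.
            self._add(-self.cost)

    def record_avoidable_miss(self) -> None:
        """Account for a miss that compression could have eliminated."""
        self._add(self.benefit)

    def should_compress(self) -> bool:
        # A positive counter predicts that allocating compressed lines pays off.
        return self.counter > 0
```

In this sketch, a workload that mostly hits near the top of the LRU stack drives the counter negative and new lines are allocated uncompressed, while a workload that benefits from the extra effective capacity drives it positive. Weighting `benefit` and `cost` by the relative miss and decompression latencies would be a natural refinement, but the specific values here are placeholders.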