Towards a theory of cache-efficient algorithms
- 1 November 2002
- journal article
- Published by Association for Computing Machinery (ACM) in Journal of the ACM
- Vol. 49 (6), 828-858
- https://doi.org/10.1145/602220.602225
Abstract
We present a model that enables us to analyze the running time of an algorithm on a computer with a memory hierarchy with limited associativity, in terms of various cache parameters. Our cache model, an extension of Aggarwal and Vitter's I/O model, enables us to establish useful relationships between the cache complexity and the I/O complexity of computations. As a corollary, we obtain cache-efficient algorithms in the single-level cache model for fundamental problems like sorting, FFT, and an important subclass of permutations. We also analyze the average-case cache behavior of mergesort, show that ignoring associativity concerns could lead to inferior performance, and present supporting experimental evidence. We further extend our model to multiple levels of cache with limited associativity and present optimal algorithms for matrix transpose and sorting. Our techniques may be used for systematic exploitation of the memory hierarchy starting from the algorithm design stage, and for dealing with the hitherto unresolved problem of limited associativity.
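The abstract's remark that ignoring associativity can hurt performance can be illustrated with a toy simulation. The sketch below is not the paper's model or any of its algorithms; it is a minimal set-associative cache with LRU replacement, and the names `count_misses` and `column_walk`, as well as all parameter values, are assumptions chosen for illustration. It counts misses when one column of a row-major matrix is read repeatedly.

```python
from collections import OrderedDict


def count_misses(addresses, num_sets, ways, block_size):
    """Count misses in a set-associative cache with LRU replacement.

    Hypothetical helper: addresses are word indices, and a block maps
    to set (block number mod num_sets).
    """
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        block = addr // block_size
        lines = sets[block % num_sets]
        if block in lines:
            lines.move_to_end(block)       # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) == ways:
                lines.popitem(last=False)  # evict the least recently used block
            lines[block] = True
    return misses


def column_walk(rows, row_stride, repeats):
    """Word addresses touched when reading column 0 of a row-major matrix."""
    return [r * row_stride for _ in range(repeats) for r in range(rows)]


# Assumed toy cache: 64 sets, 8-word blocks (512 words in total).
direct_pow2 = count_misses(column_walk(16, 512, 4), num_sets=64, ways=1, block_size=8)
direct_padded = count_misses(column_walk(16, 520, 4), num_sets=64, ways=1, block_size=8)
fully_assoc = count_misses(column_walk(16, 512, 4), num_sets=1, ways=64, block_size=8)
print(direct_pow2, direct_padded, fully_assoc)  # 64 16 16
```

With a row stride of 512 words, all 16 blocks map to the same set of the direct-mapped cache, so every access misses. Padding each row by one block (stride 520), or making the cache fully associative at the same capacity, leaves only the 16 cold misses.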
This publication has 14 references indexed in Scilit:
- Asymptotically Tight Bounds for Performing BMMC Permutations on Parallel Disk Systems. SIAM Journal on Computing, 1998
- Simple randomized mergesort on parallel disks. Parallel Computing, 1997
- Influence of cross-interferences on blocked loops. ACM Transactions on Programming Languages and Systems, 1995
- Cache profiling and the SPEC benchmarks: a case study. Computer, 1994
- The uniform memory hierarchy model of computation. Algorithmica, 1994
- Performance-Directed Cache Design. Published by Elsevier, 1990
- Evaluating associativity in CPU caches. IEEE Transactions on Computers, 1989
- An analytical cache model. ACM Transactions on Computer Systems, 1989
- The input/output complexity of sorting and related problems. Communications of the ACM, 1988
- Amortized efficiency of list update and paging rules. Communications of the ACM, 1985