Evaluating stream buffers as a secondary cache replacement
- 17 December 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Today's commodity microprocessors require a low-latency memory system to achieve high sustained performance. The conventional high-performance memory system provides fast data access via a large secondary cache. But large secondary caches can be expensive, particularly in large-scale parallel systems with many processors (and thus many caches). The authors evaluate a memory system design that can be both more cost-effective and better performing, particularly for scientific workloads: a single level of (on-chip) cache backed up only by Jouppi's stream buffers and a main memory. This memory system requires very little hardware compared to a large secondary cache and doesn't require modifications to commodity processors. The authors use trace-driven simulation of fifteen scientific applications from the NAS and PERFECT suites in their evaluation. They present two techniques to enhance the effectiveness of Jouppi's original stream buffers: filtering schemes to reduce their memory bandwidth requirement and a scheme that enables stream buffers to prefetch data being accessed in large strides. The results show that, for the majority of the benchmarks, stream buffers can attain hit rates that are comparable to typical hit rates of secondary caches. Also, the authors find that as the data-set size of the scientific workload increases, the performance of streams typically improves relative to secondary cache performance, showing that streams are more scalable to large data-set sizes.
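To make the mechanism concrete, the following is a minimal Python sketch of unit-stride stream buffers driven by a trace of primary-cache miss addresses. It is an illustration under stated assumptions, not the authors' simulator: the names `StreamBuffers` and `simulate`, the default of four streams of depth two, the head-only comparison, and the LRU reallocation policy are all choices made here for the example, and the filtering and large-stride schemes mentioned in the abstract are not modeled.

```python
from collections import deque


class StreamBuffers:
    """Hypothetical model of unit-stride stream buffers (illustrative only)."""

    def __init__(self, num_streams=4, depth=2):
        self.depth = depth
        # Each stream is a FIFO of prefetched block addresses.
        # List order tracks recency: index 0 is the least recently used stream.
        self.streams = [deque() for _ in range(num_streams)]
        self.hits = 0
        self.misses = 0

    def access(self, block):
        """Present one primary-cache miss (a block address) to the buffers."""
        for i, fifo in enumerate(self.streams):
            # Assumption: only the head entry of each FIFO is compared.
            if fifo and fifo[0] == block:
                fifo.popleft()
                # Keep the FIFO full by prefetching the next sequential block.
                fifo.append(block + self.depth)
                # Mark this stream as most recently used.
                self.streams.append(self.streams.pop(i))
                self.hits += 1
                return True
        # No stream matched: reallocate the least recently used stream
        # and start prefetching the blocks that follow the missing one.
        fifo = self.streams.pop(0)
        fifo.clear()
        fifo.extend(range(block + 1, block + 1 + self.depth))
        self.streams.append(fifo)
        self.misses += 1
        return False


def simulate(trace, num_streams=4, depth=2):
    """Return the stream-buffer hit rate over a trace of miss block addresses."""
    buffers = StreamBuffers(num_streams, depth)
    for block in trace:
        buffers.access(block)
    total = buffers.hits + buffers.misses
    return buffers.hits / total if total else 0.0


# Two interleaved sequential streams, as in a loop reading two arrays.
interleaved = [b for pair in zip(range(100, 110), range(500, 510)) for b in pair]
# A single stream that touches every second block (stride of two blocks).
strided = list(range(0, 20, 2))

sequential_rate = simulate(interleaved)
strided_rate = simulate(strided)
```

In this sketch, the interleaved trace misses only on the first reference to each array and hits on the other 18 of 20 references, while the strided trace never hits because the block at the head of each buffer is always skipped. That second case is the kind of access pattern the stride scheme described in the abstract is meant to handle.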
This publication has 11 references indexed in Scilit:
- Stride Directed Prefetching In Scalar Processors. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
- Prefetching in supercomputer instruction caches. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Prefetch unit for vector operations on scalar computers. ACM SIGARCH Computer Architecture News, 1992
- Design and evaluation of a compiler algorithm for prefetching. Published by Association for Computing Machinery (ACM), 1992
- Data prefetching in multiprocessor vector cache memories. Published by Association for Computing Machinery (ACM), 1991
- Software prefetching. Published by Association for Computing Machinery (ACM), 1991
- The Perfect Club Benchmarks: Effective Performance Evaluation of Supercomputers. The International Journal of Supercomputing Applications, 1989
- Cache operations by MRU change. IEEE Transactions on Computers, 1988
- Cache Memories. ACM Computing Surveys, 1982