Evaluation of hardware-based stride and sequential prefetching in shared-memory multiprocessors

1 April 1996

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Parallel and Distributed Systems

Vol. 7 (4), 385-398
https://doi.org/10.1109/71.494633

Abstract

We study the efficiency of previously proposed stride and sequential prefetching, two promising hardware-based prefetching schemes to reduce read-miss penalties in shared-memory multiprocessors. Although stride accesses dominate in four out of six of the applications we study, we find that sequential prefetching performs as well as, and in some cases even better than, stride prefetching for five applications. This is because 1) most strides are shorter than the block size (we assume 32-byte blocks), which means that sequential prefetching is as effective for these stride accesses, and 2) sequential prefetching also exploits the locality of read misses with nonstride accesses. However, since stride prefetching in general results in fewer useless prefetches, it offers the extra advantage of consuming less memory-system bandwidth.
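
The first reason given in the abstract can be illustrated with a toy model. The sketch below is not the paper's simulation methodology. It only checks, for a strided address stream, how often the next block touched lies within reach of a prefetcher that fetches the following `degree` blocks. The function names, the `degree` parameter, and the restriction to positive strides are assumptions made for illustration; only the 32-byte block size comes from the abstract.

```python
BLOCK_SIZE = 32  # bytes; the block size assumed in the abstract


def block_of(addr):
    """Map a byte address to its cache-block number."""
    return addr // BLOCK_SIZE


def sequential_covers(start, stride, count, degree=1):
    """Fraction of block-to-block transitions in a strided stream that a
    next-block (sequential) prefetcher of the given degree would cover.

    Hypothetical helper: assumes a positive stride and ignores timing,
    cache capacity, and coherence effects.
    """
    blocks = [block_of(start + i * stride) for i in range(count)]
    # Keep only the steps where the stream moves to a different block.
    transitions = [(a, b) for a, b in zip(blocks, blocks[1:]) if a != b]
    if not transitions:
        return 1.0  # the stream never leaves its first block
    covered = sum(1 for a, b in transitions if 0 < b - a <= degree)
    return covered / len(transitions)


if __name__ == "__main__":
    for stride in (8, 24, 96):
        print(stride, sequential_covers(0, stride, 64))
```

In this model, any stride no larger than the block size advances by at most one block per access, so a degree-1 sequential prefetcher always has the next block in reach. A stride of 96 bytes skips three blocks at a time and is covered only if `degree` is raised to 3, which is the kind of access where a stride prefetcher would be expected to help.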
