A characterization of sharing in parallel programs and its application to coherency protocol evaluation
- 17 May 1988
- journal article
- Published by Association for Computing Machinery (ACM) in ACM SIGARCH Computer Architecture News
- Vol. 16 (2), 373-382
- https://doi.org/10.1145/633625.52442
Abstract
In this paper we use trace-driven simulation to analyze the memory reference patterns of write shared data in several parallel applications. We first develop a characterization of write sharing (based on the notion of a write run), and then examine the traces, using metrics derived from the characterization. The results indicate that the amount of write sharing in all programs is small; and that it is characterized by short to medium sequences of per processor references, with little contention for either data or locks. We determine to what extent this analysis can be used to predict the coherency overhead of write-invalidate and write-broadcast protocols. We develop a simple model of write sharing from the write run characterization. By applying the results of the sharing analysis to the model, weighted by machine-specific cycle costs for carrying out coherency-related bus operations, we can estimate relative protocol performance. We compare these results to those from detailed architectural simulations. The simulation results indicate that (1) neither protocol dominates in performance; and that (2) the write run model is a good predictor of protocol performance when the unit of the coherency operations matches that in the sharing analysis. This is the case for the write-broadcast protocols, in which one word is broadcast for each write to shared data. However, in Berkeley Ownership, a write-invalidate protocol, the unit of coherency is an entire cache block. When the block size is large, performance for this protocol is quite sensitive to the memory reference patterns within the block.
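The abstract does not define a write run or spell out the cost model, so the following Python sketch is only an illustration under stated assumptions. It assumes a write run is a sequence of writes to one shared address by a single processor, ended by any access to that address from another processor. It also assumes a deliberately simplified cost model: one invalidation per write run for write-invalidate, and one broadcast per write for write-broadcast. The function names, the trace format, and the cycle costs are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def write_runs(trace):
    """Return {address: [write-run lengths]} for a reference trace.

    trace is an iterable of (processor, op, address) tuples, where op is
    'r' or 'w'. ASSUMPTION: a write run is ended by any access (read or
    write) to the same address from a different processor.
    """
    current = {}              # address -> [owning processor, writes so far]
    runs = defaultdict(list)  # address -> lengths of completed runs
    for proc, op, addr in trace:
        state = current.get(addr)
        if state is not None and state[0] != proc:
            # Another processor touched the address: close the run.
            runs[addr].append(state[1])
            del current[addr]
            state = None
        if op == 'w':
            if state is None:
                current[addr] = [proc, 1]   # start a new run
            else:
                state[1] += 1               # extend the current run
    # Close any runs still open at the end of the trace.
    for addr, (_owner, length) in current.items():
        runs[addr].append(length)
    return dict(runs)


def estimate_overhead(runs, invalidate_cycles, broadcast_cycles):
    """Estimate coherency bus cycles under the simplified model.

    ASSUMPTION: write-invalidate pays once per write run, and
    write-broadcast pays once per write. Both cycle costs are
    hypothetical, machine-specific inputs.
    """
    n_runs = sum(len(lengths) for lengths in runs.values())
    n_writes = sum(sum(lengths) for lengths in runs.values())
    return {
        "write_invalidate": n_runs * invalidate_cycles,
        "write_broadcast": n_writes * broadcast_cycles,
    }


# Hypothetical trace: processor 0 writes three times (with one of its own
# reads in between), then processor 1 reads and writes, then processor 0
# writes again.
example_trace = [
    (0, 'w', 0x10), (0, 'w', 0x10), (0, 'r', 0x10), (0, 'w', 0x10),
    (1, 'r', 0x10), (1, 'w', 0x10), (0, 'w', 0x10),
]
example_runs = write_runs(example_trace)
example_costs = estimate_overhead(example_runs,
                                  invalidate_cycles=10,
                                  broadcast_cycles=3)
```

Under these assumptions, long write runs favor write-invalidate (one bus operation covers many writes), while short runs favor write-broadcast. The sketch tracks individual addresses, not cache blocks, so it does not capture the block-size sensitivity the abstract reports for Berkeley Ownership.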