Is SC + ILP = RC?
- 1 May 1999
- journal article
- Published by Association for Computing Machinery (ACM) in ACM SIGARCH Computer Architecture News
- Vol. 27 (2) , 162-171
- https://doi.org/10.1145/307338.300993
Abstract
Sequential consistency (SC) is the simplest programming interface for shared-memory systems but imposes program order among all memory operations, possibly precluding high performance implementations. Release consistency (RC), however, enables the highest performance implementations but puts the burden on the programmer to specify which memory operations need to be atomic and in program order. This paper shows, for the first time, that SC implementations can perform as well as RC implementations if the hardware provides enough support for speculation. Both SC and RC implementations rely on reordering and overlapping memory operations for high performance. To enforce order when necessary, an RC implementation uses software guarantees, whereas an SC implementation relies on hardware speculation. Our SC implementation, called SC++, closes the performance gap because: (1) the hardware allows not just loads, as some current SC implementations do, but also stores to bypass each other speculatively to hide remote latencies, (2) the hardware provides large speculative state for not just processor, as previously proposed, but also memory to allow out-of-order memory operations, (3) the support for hardware speculation does not add excessive overheads to processor pipeline critical paths, and (4) well-behaved applications incur infrequent rollbacks of speculative execution. Using simulation, we show that SC++ achieves an RC implementation's performance in all the six applications we studied.
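As an illustration of why program order matters, the following minimal sketch enumerates the outcomes of a two-processor message-passing litmus test. It is not taken from the paper: the test program, the names `P0`, `P1`, `interleavings`, `run`, and `outcomes` are all hypothetical, and the "relaxed" case is a deliberately crude model that lets each processor reorder its own two operations freely. Real RC hardware is more constrained, and the paper's SC++ design is not modelled here at all.

```python
from itertools import permutations

# Hypothetical message-passing litmus test (not from the paper):
#   P0: data = 1; flag = 1
#   P1: r1 = flag; r2 = data
P0 = [("store", "data", 1), ("store", "flag", 1)]
P1 = [("load", "flag", "r1"), ("load", "data", "r2")]

def interleavings(a, b):
    """Yield every merge of a and b that keeps each sequence's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def run(schedule):
    """Execute one global order of operations against a single shared memory."""
    mem = {"data": 0, "flag": 0}
    regs = {}
    for kind, addr, arg in schedule:
        if kind == "store":
            mem[addr] = arg
        else:
            regs[arg] = mem[addr]
    return regs["r1"], regs["r2"]

def outcomes(p0_orders, p1_orders):
    """Collect all (r1, r2) results over the allowed per-processor orders."""
    return {
        run(s)
        for p0 in p0_orders
        for p1 in p1_orders
        for s in interleavings(p0, p1)
    }

# SC: each processor's operations appear in program order.
sc = outcomes([P0], [P1])

# Crude relaxed model (assumption): each processor may reorder its two operations.
relaxed = outcomes(list(permutations(P0)), list(permutations(P1)))

print("SC outcomes:     ", sorted(sc))
print("Relaxed outcomes:", sorted(relaxed))
```

Under SC the result `(r1, r2) = (1, 0)` never appears, because a reader that sees the flag must also see the data. In the relaxed model it does appear, which is the case an RC programmer would rule out by labelling the `flag` accesses as synchronization operations.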
This publication has 14 references indexed in Scilit:
- Multiprocessors should support simple memory consistency models. Computer, 1998
- Complexity-effective superscalar processors. Published by Association for Computing Machinery (ACM), 1997
- Using speculative retirement and larger instruction windows to narrow the performance gap between memory consistency models. Published by Association for Computing Machinery (ACM), 1997
- The Mips R10000 superscalar microprocessor. IEEE Micro, 1996
- Shared memory consistency models: a tutorial. Computer, 1996
- The SPLASH-2 programs. Published by Association for Computing Machinery (ACM), 1995
- The Stanford Dash multiprocessor. Computer, 1992
- Memory consistency and event ordering in scalable shared-memory multiprocessors. Published by Association for Computing Machinery (ACM), 1990
- Weak ordering---a new definition. Published by Association for Computing Machinery (ACM), 1990
- How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs. IEEE Transactions on Computers, 1979