Effects of multithreading on cache performance
- 1 January 1999
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Computers
- Vol. 48 (2), 176-184
- https://doi.org/10.1109/12.752659
Abstract
As the performance gap between processor and memory grows, memory latency becomes a major bottleneck in achieving high processor utilization. Multithreading has emerged as one of the most promising and exciting techniques used to tolerate memory latency by exploiting thread-level parallelism. The question, however, remains as to how effective multithreading is at tolerating memory latency. The performance of multithreading is not only affected by the overlapping of memory latency with useful computation, but also strongly depends on the cache behavior and the overhead of multithreading (e.g., thread management and context-switch costs). In particular, multithreading affects the behavior of caches and, thus, the overall performance in a nontrivial fashion. To study these issues, this paper presents the Multithreaded Virtual Processor (MVP) model. MVP integrates the multithreaded programming paradigm and a modern superscalar processor with support for fast context switching and thread scheduling. Our studies with MVP show that, in general, the performance improvements are obtained not only by tolerating memory latency but also by lowering cache miss rates through the exploitation of data locality. However, multithreading creates additional stress on the memory hierarchy caused by interference among threads. Also, the dynamic behavior of multithreaded execution hinders instruction locality, which results in a high number of misses in the L1 instruction cache.
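The inter-thread interference described in the abstract can be illustrated with a toy simulation. The sketch below is not the MVP model from the paper. It assumes a small direct-mapped cache and two hypothetical threads whose working sets each fit in the cache alone but map to the same cache lines. The cache geometry, the trace shapes, and the function names (`miss_rate`, `sweep`, `interleave`) are all assumptions chosen for illustration.

```python
def miss_rate(trace, num_lines=64, line_size=32):
    """Simulate a direct-mapped cache and return the miss rate of an address trace."""
    tags = [None] * num_lines          # one resident block number per cache line
    misses = 0
    for addr in trace:
        block = addr // line_size      # memory block holding this address
        index = block % num_lines      # cache line the block maps to
        if tags[index] != block:       # cold miss or conflict miss
            misses += 1
            tags[index] = block
    return misses / len(trace)


def sweep(base, num_blocks, passes, line_size=32):
    """Address trace of a thread that repeatedly sweeps its own working set."""
    return [base + b * line_size for _ in range(passes) for b in range(num_blocks)]


def interleave(a, b, quantum):
    """Round-robin between two traces, switching every `quantum` accesses."""
    out = []
    for i in range(0, max(len(a), len(b)), quantum):
        out.extend(a[i:i + quantum])
        out.extend(b[i:i + quantum])
    return out


# Hypothetical workloads: each thread touches 48 blocks (fits in 64 lines),
# but the second thread's base address aliases onto the same cache lines.
t0 = sweep(base=0, num_blocks=48, passes=20)
t1 = sweep(base=64 * 32, num_blocks=48, passes=20)

alone = miss_rate(t0)                                # only cold misses
shared = miss_rate(interleave(t0, t1, quantum=8))    # threads evict each other

print(f"single thread miss rate: {alone:.2f}")       # 0.05
print(f"interleaved miss rate:   {shared:.2f}")      # 1.00
```

In this deliberately adversarial layout each thread evicts the other's lines before they are reused, so every access misses. Real workloads fall somewhere between the two extremes, and the sketch ignores the latency-hiding and data-locality benefits that the abstract also reports.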