Accelerated warmup for sampled microarchitecture simulation
- 1 March 2005
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Architecture and Code Optimization
- Vol. 2 (1), 78-108
- https://doi.org/10.1145/1061267.1061272
Abstract
To reduce the cost of cycle-accurate software simulation of microarchitectures, many researchers use statistical sampling: by simulating only a small, representative subset of the end-to-end dynamic instruction stream in cycle-accurate detail, simulation results complete in much less time than simulating the cycle-by-cycle progress of an entire benchmark. In order for sampled simulation results to accurately reflect the nature of the full dynamic instruction stream, however, state in the simulated cache and branch predictor must match or closely approximate state as it would have appeared had cycle-accurate simulation been used for the entire simulation. Researchers typically address this issue by prefixing a period of warmup (in which cache and branch predictor state are modeled in addition to programmer-visible architected state) to each cluster of contiguous instructions in the sample.

One conservative but slow approach is to always simulate cache and branch predictor state, whether among the cycle-accurate clusters or among the instructions preceding each cluster. To save time, warmup heuristics have been proposed, but there is no one-size-fits-all heuristic for any benchmark. More rigorous, analytical warmup approaches are necessary in order to balance the requirements of high accuracy and rapidity from sampled simulations. This paper explores this issue and in particular demonstrates the merits of memory reference reuse latency (MRRL).

Relative to the IPC measured by modeling all precluster cache and branch predictor activity, MRRL generated an average error in IPC of less than 1% and simultaneously reduced simulation running times by an average of approximately 50% (or 95% of the maximum potential speedup).
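The abstract names MRRL but does not say how it is computed. The sketch below is an illustration only, not the paper's implementation. It assumes that a reuse latency is the number of instructions between two consecutive references to the same memory address, and that warmup is sized to cover a chosen percentile of those latencies. The trace format, the function names `reuse_latencies` and `warmup_length`, and the default percentile are all hypothetical.

```python
import math


def reuse_latencies(trace):
    """Yield the instruction distance between consecutive references
    to the same address.

    `trace` is an iterable of (instruction_index, address) pairs in
    program order. This format is an assumption made for illustration.
    """
    last_seen = {}
    for instr_index, addr in trace:
        if addr in last_seen:
            yield instr_index - last_seen[addr]
        last_seen[addr] = instr_index


def warmup_length(trace, percentile=0.999):
    """Return a warmup length (in instructions) that covers the given
    fraction of observed reuse latencies, using a nearest-rank percentile.

    The default percentile is an arbitrary illustrative value.
    """
    latencies = sorted(reuse_latencies(trace))
    if not latencies:
        return 0  # no address is reused, so no warmup is suggested
    rank = max(1, math.ceil(percentile * len(latencies)))
    return latencies[rank - 1]


if __name__ == "__main__":
    # Tiny hand-made trace: (instruction index, address referenced).
    trace = [(0, 0x10), (1, 0x20), (3, 0x10), (4, 0x30), (9, 0x20), (10, 0x10)]
    print(sorted(reuse_latencies(trace)))
    print(warmup_length(trace, percentile=0.5))
```

Under these assumptions, a lower percentile gives a shorter warmup period and a faster simulation, at the risk of leaving more cache and branch predictor state stale when the cycle-accurate cluster begins.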