Abstract
Data cache performance is critical to overall processor performance as the latency gap between CPU core and main memory increases. Studies have shown that some loads have latency demands that allow them to be serviced from slower portions of memory, thus allowing more critical data to be kept in higher levels of the cache. We provide a strategy for identifying this latency-tolerant data at runtime and, using simple heuristics, keep it out of the main cache and place it instead in a small, parallel, associative buffer. Using such a "Non-Critical Buffer" dramatically improves the hit rate for more critical data, and leads to a performance improvement comparable to or better than other traditional cache improvement schemes. IPC improvements of over 4% are seen for some benchmarks.

This publication has 11 references indexed in Scilit: