Hardware for speculative parallelization of partially-parallel loops in DSM multiprocessors

1 January 1999

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 135-139
https://doi.org/10.1109/hpca.1999.744351

Abstract

Recently, we introduced a novel framework for speculative parallelization in hardware (Y. Zhang et al., 1998). The scheme is based on a software-based run-time parallelization scheme that we proposed earlier (L. Rauchwerger and D. Padua, 1995). The idea is to execute the code (loops) speculatively in parallel. As parallel execution proceeds, extra hardware added to the directory-based cache coherence of the DSM machine detects if there is a dependence violation. If such a violation occurs, execution is interrupted, the state is rolled back in software to the most recent safe state, and the code is re-executed serially from that point. The safe state is typically established at the beginning of the loop. Such a scheme is somewhat related to speculative parallelization inside a multiprocessor chip, which also relies on extending the cache coherence protocol to detect dependence violations. Our scheme, however, is targeted to large-scale DSM parallelism. In addition, it does not have some of the limitations of the proposed chip-multiprocessor schemes. Such limitations include the need to bound the size of the speculative state to fit in a buffer or L1 cache, and a strict in-order task commit policy that may result in load imbalance among processors. Unfortunately, our scheme has higher recovery costs if a dependence violation is detected, because execution has to backtrack to a safe state that is usually the beginning of the loop. Therefore, the aim of the paper is to extend our previous hardware scheme to effectively handle codes (loops) with a modest number of cross-iteration dependences.
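
The speculate, detect, roll back, and re-execute cycle described in the abstract can be illustrated with a small software simulation. This is only a sketch under stated assumptions, not the hardware scheme of the paper: the function name `run_speculatively`, the access-logging approach, and the conservative violation rule (any element written by one iteration and touched by another) are all hypothetical choices made for illustration, and the out-of-order execution merely stands in for a parallel interleaving.

```python
import copy


def run_speculatively(data, n_iters, body, order=None):
    """Run a loop speculatively; fall back to serial execution on a violation.

    Hypothetical illustration only. body(i, read, write) performs iteration i
    and must touch `data` exclusively through read(idx) and write(idx, value).
    Returns True if a dependence violation forced a rollback.
    """
    checkpoint = copy.deepcopy(data)  # safe state taken at the start of the loop
    readers = {}  # element index -> set of iterations that read it
    writers = {}  # element index -> set of iterations that wrote it

    def accessors(i):
        def read(idx):
            readers.setdefault(idx, set()).add(i)
            return data[idx]

        def write(idx, value):
            writers.setdefault(idx, set()).add(i)
            data[idx] = value

        return read, write

    # Assumption: reversed order stands in for an arbitrary parallel interleaving.
    if order is None:
        order = reversed(range(n_iters))
    for i in order:
        body(i, *accessors(i))

    # Conservative check: an element written by some iteration and accessed
    # by any other iteration counts as a cross-iteration dependence.
    violation = any(
        len(w | readers.get(idx, set())) > 1 for idx, w in writers.items()
    )

    if violation:
        data[:] = checkpoint  # roll back to the safe state
        for i in range(n_iters):  # re-execute serially, in program order
            body(
                i,
                lambda idx: data[idx],
                lambda idx, value: data.__setitem__(idx, value),
            )
    return violation


def independent(i, read, write):
    write(i, read(i) * 2)  # each iteration touches only its own element


def dependent(i, read, write):
    if i > 0:
        write(i, read(i - 1) + 1)  # iteration i needs the result of i - 1
```

Calling `run_speculatively(a, len(a), independent)` on a list `a` leaves the speculative result in place, whereas the `dependent` body triggers the rollback path and the list ends up with the values a serial loop would produce. Because recovery restarts from the beginning of the loop, a single violation discards all speculative work, which is the recovery cost the abstract refers to.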

Keywords

HARDWARE

This publication has 5 references indexed in Scilit:

Speculative versioning cache
Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
The potential for using thread-level data speculation to facilitate automatic parallelization
Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
The LRPD test: speculative run-time parallelization of loops with privatization and reduction parallelization
IEEE Transactions on Parallel and Distributed Systems, 1999
Data speculation support for a chip multiprocessor
Published by Association for Computing Machinery (ACM), 1998
Hardware and software support for speculative execution of sequential binaries on a chip-multiprocessor
Published by Association for Computing Machinery (ACM), 1998