The UCLA mirror processor: a building block for self-checking self-repairing computing nodes

Abstract
The design and implementation of a RISC microprocessor, called the UCLA mirror processor, which is capable of micro rollback, are reported. Two mirror processors operating in lock step achieve concurrent error detection by comparing external signals and a signature of internal signals every clock cycle. A mismatch causes both processors to roll back to the beginning of the cycle in which the error occurred. In some cases an erroneous state is corrected by copying a value from the fault-free processor to the faulty processor. The architecture, microarchitecture, and VLSI implementation of the mirror processor, with an emphasis on its error-detection and error-recovery capabilities, are described. The overhead and design issues encountered are evaluated. It is shown that micro rollback can be implemented in a practical VLSI chip and is a practical technique for minimizing the latencies normally associated with concurrent error detection.
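The lock-step compare-and-roll-back loop described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the chip's actual mechanism: `ToyProcessor`, `run_lockstep`, the single 16-bit accumulator, the use of the raw state as the "signature", and the injected single-bit transient fault are all hypothetical and chosen for illustration. The correction path that copies a value from the fault-free processor is not modeled.

```python
from dataclasses import dataclass


@dataclass
class ToyProcessor:
    """Hypothetical stand-in for one processor: a single accumulator."""
    acc: int = 0
    checkpoint: int = 0

    def begin_cycle(self):
        # Save the state at the start of the cycle so it can be restored.
        self.checkpoint = self.acc

    def execute(self, operand, flip_bit=None):
        self.acc = (self.acc + operand) & 0xFFFF
        if flip_bit is not None:
            # Assumption: a transient fault flips one bit of the result.
            self.acc ^= 1 << flip_bit

    def signature(self):
        # Toy signature: the state itself (real hardware would compress
        # many internal signals into a short signature).
        return self.acc

    def rollback(self):
        # Restore the state saved at the beginning of the cycle.
        self.acc = self.checkpoint


def run_lockstep(operands, faults):
    """Run two processors in lock step.

    `faults` maps a cycle index to a bit that is flipped in processor B
    on the first attempt of that cycle only (a transient fault).
    Returns (state of A, state of B, number of rollbacks).
    """
    a, b = ToyProcessor(), ToyProcessor()
    pending = dict(faults)
    rollbacks = 0
    for cycle, operand in enumerate(operands):
        while True:
            a.begin_cycle()
            b.begin_cycle()
            a.execute(operand)
            b.execute(operand, pending.pop(cycle, None))
            if a.signature() == b.signature():
                break  # signatures agree: commit the cycle
            # Mismatch: both processors roll back and retry the cycle.
            a.rollback()
            b.rollback()
            rollbacks += 1
    return a.acc, b.acc, rollbacks


if __name__ == "__main__":
    print(run_lockstep([1, 2, 3], {1: 4}))
```

In this model a mismatch costs one retried cycle, which mirrors the abstract's point that rollback to the start of the faulty cycle keeps the error-detection latency small; a permanent fault would loop forever here and would need the copy-based correction or another recovery path.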
