The UCLA mirror processor: a building block for self-checking self-repairing computing nodes
- 10 December 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 178-185
- https://doi.org/10.1109/ftcs.1991.146658
Abstract
The design and implementation of a RISC microprocessor, called the UCLA mirror processor, which is capable of micro rollback, are reported. Two mirror processors operating in lock step achieve concurrent error detection by comparing external signals and a signature of internal signals every clock cycle. A mismatch causes both processors to roll back to the beginning of the cycle in which the error occurred. In some cases an erroneous state is corrected by copying a value from the fault-free processor to the faulty processor. The architecture, microarchitecture, and VLSI implementation of the mirror processor, with an emphasis on its error-detection and error-recovery capabilities, are described. The overhead and design issues encountered are evaluated. It is shown that micro rollback can be implemented in a practical VLSI chip and is a practical technique for minimizing the latencies normally associated with concurrent error detection.
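The lock-step comparison and rollback described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the mirror processor's actual design: `ToyProcessor`, its one-accumulator instruction set, the `signature` method, and the fault-injection interface are all hypothetical, and faults are assumed to be transient so that retrying the cycle succeeds. The case in which state is copied from the fault-free processor to the faulty one is not modeled.

```python
from dataclasses import dataclass


@dataclass
class ToyProcessor:
    """Hypothetical one-accumulator machine, not the real mirror processor ISA."""
    acc: int = 0
    pc: int = 0

    def snapshot(self):
        # State at the beginning of the cycle, kept so the cycle can be undone.
        return (self.acc, self.pc)

    def restore(self, snap):
        self.acc, self.pc = snap

    def step(self, program, flip_bit=None):
        # Execute one instruction: add the operand to the accumulator.
        self.acc = (self.acc + program[self.pc]) & 0xFFFFFFFF
        if flip_bit is not None:
            # Inject a transient single-bit fault into the result (assumption).
            self.acc ^= 1 << flip_bit
        self.pc += 1

    def signature(self):
        # Stand-in for the signature of internal signals compared each cycle.
        return (self.acc, self.pc)


def run_lockstep(program, faults=None):
    """Run two processors in lock step.

    faults maps a cycle number to (processor_index, bit_to_flip).
    Returns the final accumulator value and the number of rollbacks.
    """
    faults = dict(faults or {})
    cpus = [ToyProcessor(), ToyProcessor()]
    cycle = 0
    rollbacks = 0
    while cpus[0].pc < len(program):
        snaps = [cpu.snapshot() for cpu in cpus]
        # A transient fault is consumed on the first attempt at this cycle.
        fault = faults.pop(cycle, None)
        for index, cpu in enumerate(cpus):
            bit = fault[1] if fault and fault[0] == index else None
            cpu.step(program, flip_bit=bit)
        if cpus[0].signature() != cpus[1].signature():
            # Mismatch: both processors return to the start of this cycle.
            for cpu, snap in zip(cpus, snaps):
                cpu.restore(snap)
            rollbacks += 1
            continue  # retry the same cycle
        cycle += 1
    return cpus[0].acc, rollbacks
```

For example, `run_lockstep([1, 2, 3, 4], faults={2: (1, 5)})` injects a bit flip into the second processor during cycle 2; in this model the mismatch is caught in that same cycle, both processors roll back, and the retry completes with the same result as a fault-free run.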
This publication has 14 references indexed in Scilit:
- The implementation and application of micro rollback in fault-tolerant VLSI systems. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- High-performance fault-tolerant VLSI systems using micro rollback. IEEE Transactions on Computers, 1990
- Implementing precise interrupts in pipelined processors. IEEE Transactions on Computers, 1988
- Checkpoint repair for out-of-order execution machines. Published by Association for Computing Machinery (ACM), 1987
- A 32-bit NMOS microprocessor with a large register file. IEEE Journal of Solid-State Circuits, 1984
- The Intel 432: A VLSI Architecture for Fault-Tolerant Computer Systems. Computer, 1984
- Arbitration and Control Acquisition in the Proposed IEEE 896 Futurebus. IEEE Micro, 1984
- Reliability Issues in Computing System Design. ACM Computing Surveys, 1978
- Architectures for fault-tolerant spacecraft computers. Proceedings of the IEEE, 1978
- No. 1 ESS Maintenance Plan. Bell System Technical Journal, 1964