On-line Fault Detection And Correction In Microprocessor Systems
- 24 August 2005
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
In most on-line diagnostic schemes, whenever a fault is detected in a system, a rather involved system recovery routine is initiated irrespective of whether the fault is caused by a failure inside a chip or by a failure outside a chip, say, on the bond connecting a pin to the chip. Failures of the latter type cause errors only when information is being transferred from one chip to another. In this paper, two new techniques for system recovery are described for the case when an error is on any such data transfer path. These schemes are implementable locally, and the system is ensured to recover from any single stuck-at fault, single AND-bridge fault, or single OR-bridge fault in a single retry. System recovery from faults internal to chips can be performed using sophisticated routines. Thus, a two-level approach to on-line system diagnosis seems to be more efficient.