Utilizing spares in multichip modules for the dual function of fault coverage and fault diagnosis
- 19 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
This paper defines a dual role for spare processing elements (PEs) in reliability-challenged processing arrays, and explores a practical way to include reconfiguration hardware in single-package arrays. Array processor systems may include spare PEs for fault tolerance. These systems typically require a host for fault diagnosis, while the healthy spares sit idle. It is proposed to use the idle spare PEs for fault diagnosis, giving the array the capability of self-diagnosis. Fault tolerance requires additional hardware for reconfiguration, and existing schemes have not found widespread use in single-package systems because of the extra cost and extra real estate. Multichip modules (MCMs) have the potential to offer fault tolerance with no increase in primary circuit area. It is proposed to place the reconfiguration hardware in the active substrate of a silicon-based MCM. Further, the switches required for spares coverage can aid in comparison-based self-testing. We offer a complete solution to fault-tolerant arrays in the sense that diagnosis, reconfiguration and switching details are all addressed.
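As an illustration only, the idea of an idle spare performing comparison-based diagnosis can be sketched in a few lines of Python. This is not the paper's algorithm: the PE functions, the `diagnose` and `reconfigure` helpers, the routing list, and the majority rule used to decide when the spare itself is suspect are all hypothetical, and the sketch assumes that faulty units are a minority and that a fault shows up on at least one test input.

```python
from typing import Callable, Dict, List, Sequence

PE = Callable[[int], int]  # hypothetical model: a PE maps an input word to an output word


def healthy_pe(x: int) -> int:
    """Reference behaviour assumed for a fault-free PE."""
    return x * x + 1


def faulty_pe(x: int) -> int:
    """Same computation with the least significant output bit stuck at 1."""
    return (x * x + 1) | 1


def diagnose(active: Sequence[PE], spare: PE, test_inputs: Sequence[int]) -> Dict:
    """Let the spare shadow each active PE and compare outputs.

    Assumption: faults are a minority, so if the spare disagrees with
    more than half of the active PEs, the spare itself is the suspect.
    """
    mismatched = [
        i for i, pe in enumerate(active)
        if any(pe(x) != spare(x) for x in test_inputs)
    ]
    if len(mismatched) > len(active) / 2:
        return {"spare_suspect": True, "suspect_pes": []}
    return {"spare_suspect": False, "suspect_pes": mismatched}


def reconfigure(routing: List[int], suspect_pes: List[int], spare_ids: List[int]) -> List[int]:
    """Model the switch settings as a list: logical position -> physical PE id.

    Each suspect physical PE is replaced by the next free spare, if any.
    """
    free = list(spare_ids)
    new_routing = list(routing)
    for logical, physical in enumerate(routing):
        if physical in suspect_pes and free:
            new_routing[logical] = free.pop(0)
    return new_routing


active = [healthy_pe, healthy_pe, faulty_pe, healthy_pe]
result = diagnose(active, healthy_pe, range(4))
routing = reconfigure([0, 1, 2, 3], result["suspect_pes"], [4])
```

Here the spare (physical id 4) first acts as the comparator and then takes over the logical position of the PE it flagged, which mirrors the dual role described in the abstract. Modelling the switches as a plain routing list is a simplification chosen for brevity.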