Greedy Diagnosis as the Basis of an Intermittent-Fault/Transient-Upset Tolerant System Design

Abstract
This paper considers multiple-unit computer systems that must tolerate intermittently faulty or transiently upset units. Designs for such systems are developed that exploit a new theory called greedy diagnosis. With greedy diagnosis, the condition of a unit (intermittent-fault case) or the integrity of data (transient-upset case) can be assessed from syndromes formed by comparing the results of jobs performed by pairs of units. Greedy diagnosis removes the requirement that such syndromes, to be useful, be interpretable from a permanent-fault/continuous-upset perspective.
