Abstract
The authors formally define what it means for a system to tolerate a class of faults. The definition consists of two conditions. The first is that if a fault occurs when the system state is within the set of legal states, the resulting state is within some larger set and, if faults continue to occur, the system state remains within that larger set (closure). The second is that if faults stop occurring, the system eventually reaches a state within the legal set (convergence). The applicability of the definition for specifying and verifying the fault-tolerance properties of a variety of digital and computer systems is demonstrated. Using the definition, the authors obtain a simple classification of fault-tolerant systems. Methods for the systematic design of such systems are discussed.<>

This publication has 25 references indexed in Scilit: