Closure and convergence: a foundation of fault-tolerant computing
- 1 November 1993
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Software Engineering
- Vol. 19 (11), 1015-1027
- https://doi.org/10.1109/32.256850
Abstract
The authors formally define what it means for a system to tolerate a class of faults. The definition consists of two conditions. The first is that if a fault occurs when the system state is within the set of legal states, the resulting state is within some larger set and, if faults continue to occur, the system state remains within that larger set (closure). The second is that if faults stop occurring, the system eventually reaches a state within the legal set (convergence). The applicability of the definition for specifying and verifying the fault-tolerance properties of a variety of digital and computer systems is demonstrated. Using the definition, the authors obtain a simple classification of fault-tolerant systems. Methods for the systematic design of such systems are discussed.
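The two conditions can be illustrated on a toy finite-state system. The sketch below is not taken from the paper: the integer states, the `program` and `fault` functions, and the sets `LEGAL` and `SPAN` are hypothetical, chosen only to show how closure and convergence might be checked by brute force when the state space is small and the program is deterministic.

```python
from typing import Callable, Iterable, Set

State = int

def is_closed(states: Set[State], step: Callable[[State], Iterable[State]]) -> bool:
    """True if every successor of every state in `states` is also in `states`."""
    return all(nxt in states for s in states for nxt in step(s))

def converges(span: Set[State], legal: Set[State],
              program: Callable[[State], State]) -> bool:
    """True if, with no faults, every state in `span` reaches `legal`
    within len(span) steps of the (deterministic) program."""
    for start in span:
        s = start
        for _ in range(len(span)):
            if s in legal:
                break
            s = program(s)
        if s not in legal:
            return False
    return True

# Hypothetical system (an assumption for illustration only).
LEGAL = {0, 1}          # legal states
SPAN = set(range(6))    # larger set the system stays in while faults occur

def program(x: State) -> State:
    # Toggle between the legal states; otherwise step back toward them.
    return 1 - x if x in LEGAL else x - 1

def fault(x: State) -> list:
    # A fault bumps the state by 2, but never beyond 5.
    return [x + 2] if x + 2 <= 5 else []

closure_ok = (
    LEGAL <= SPAN
    and is_closed(LEGAL, lambda s: [program(s)])            # legal set closed under the program
    and is_closed(SPAN, lambda s: [program(s), *fault(s)])  # larger set closed under program and faults
)
convergence_ok = converges(SPAN, LEGAL, program)
tolerant = closure_ok and convergence_ok
```

Here the legal set alone is not closed under faults (a fault takes state 0 to state 2), which is why the larger set is needed; once faults stop, the program walks any state in the larger set back to the legal set.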