Simple models of hardware and software fault tolerance
- 17 December 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. se 13, 124-129
- https://doi.org/10.1109/rams.1994.291094
Abstract
This paper presents a quantitative analysis of three different architectural approaches to the integration of hardware and software fault tolerance. Using a common set of assumptions and hypothetical parameter values, the authors compare the reliability of DRB (distributed recovery blocks), NVP (N-version programming) and NSCP (N self-checking programming). A combination of fault trees and Markov reward models is used to consider transient and permanent physical faults, and independent and related software faults. The fault tree models capture the combinations of software faults and hardware transients that can upset a single task computation. The structure states of the Markov reward process capture the longer-term behavior of the system as it is reconfigured in response to permanent faults. In addition to a base case, several different scenarios are considered, including perfect specifications, independent versions, perfect decider and perfect coverage. For most cases, DRB is found to be the most reliable.
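The two-level modeling idea described in the abstract can be illustrated with a small sketch. It is not the authors' model: the fault tree structure, the three structure states, the discrete-time chain (used here in place of a continuous-time Markov reward process), and all names and parameter values are assumptions made only for illustration. A fault tree gives the probability that a single task computation fails in a given configuration, and a Markov chain over structure states weights those probabilities as the system degrades under permanent faults.

```python
from math import prod


def or_gate(*probs):
    """Probability that at least one of several independent events occurs."""
    return 1.0 - prod(1.0 - p for p in probs)


def and_gate(*probs):
    """Probability that all of several independent events occur."""
    return prod(probs)


def task_failure_probability(p_version, p_related, p_decider, p_transient):
    """Hypothetical fault tree for one task on a two-version configuration.

    The task fails if both versions fail independently, or a related
    software fault hits both, or the decider fails, or a hardware
    transient upsets the computation.
    """
    both_versions_fail = and_gate(p_version, p_version)
    return or_gate(both_versions_fail, p_related, p_decider, p_transient)


def state_distribution(p_node_fail, steps):
    """Distribution over structure states after `steps` transitions.

    States (an assumption of this sketch): 0 = two nodes up,
    1 = one node up, 2 = system down. Each working node fails
    permanently with probability p_node_fail per step; no repair.
    """
    q = p_node_fail
    transition = [
        [(1 - q) ** 2, 2 * q * (1 - q), q * q],
        [0.0, 1 - q, q],
        [0.0, 0.0, 1.0],
    ]
    dist = [1.0, 0.0, 0.0]  # start with both nodes working
    for _ in range(steps):
        dist = [sum(dist[i] * transition[i][j] for i in range(3))
                for j in range(3)]
    return dist


def expected_task_failure(dist, rewards):
    """Expected per-task failure probability: state probabilities
    weighted by the reward (failure probability) attached to each state."""
    return sum(p * r for p, r in zip(dist, rewards))


if __name__ == "__main__":
    # All values below are hypothetical, chosen only to exercise the code.
    two_up = task_failure_probability(1e-3, 1e-5, 1e-6, 1e-4)
    one_up = or_gate(1e-3, 1e-4)  # single version, no redundancy left
    rewards = [two_up, one_up, 1.0]  # a down system always fails the task
    dist = state_distribution(p_node_fail=1e-4, steps=1000)
    print(expected_task_failure(dist, rewards))
```

Attaching the fault tree results as rewards to structure states keeps the short-term question (can this task be upset?) separate from the long-term one (which configuration is the system in?), which is the separation the abstract describes.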
This publication has 13 references indexed in Scilit:
- Hardware and software fault tolerance: a unified architectural approach. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- X-ware reliability and availability modeling. IEEE Transactions on Software Engineering, 1992
- Reliability estimation of fault-tolerant systems: tools and techniques. Computer, 1990
- Distributed execution of recovery blocks: an approach for uniform treatment of hardware and software faults in real-time applications. IEEE Transactions on Computers, 1989
- Survey of software tools for evaluating reliability, availability, and serviceability. ACM Computing Surveys, 1988
- Reliability Modeling Using SHARPE. IEEE Transactions on Reliability, 1987
- Fault-Tolerant Software Reliability Modeling. IEEE Transactions on Software Engineering, 1987
- Evaluation of Error Recovery Blocks Used for Cooperating Processes. IEEE Transactions on Software Engineering, 1984
- Dependability Evaluation of Software Systems in Operation. IEEE Transactions on Software Engineering, 1984
- System structure for software fault tolerance. IEEE Transactions on Software Engineering, 1975