A Unified Method for Analyzing Mission Reliability for Fault Tolerant Computer Systems
- 1 June 1973
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Reliability
- Vol. R-22 (2), 72-77
- https://doi.org/10.1109/tr.1973.5216037
Abstract
A reliability model is proposed and evaluated for a fault tolerant computer system which consists of multiple classes of modules and allows for degraded modes of performance. Each module of a given class has both an active and a passive hazard rate; constant hazard rates are assumed for active and dormant failures, and the given class may operate either in N Modular Redundancy (NMR: n + 1 out of 2n + 1 = N) or as a standby sparing system. The model allows for mission-phase changes at deterministic time points when the numbers of modules per class can be changed. The analysis proceeds by generalizing the notions of standby and NMR redundancy, which for N = 3 is TMR (Triple Modular Redundancy), into a concept called hybrid-degraded redundancy. The probabilistic evaluation of the unified redundancy concept is then developed to yield, for a given modular class, the joint distribution of success and the number of nonfailed modules from that class, at special times. With this information, a Markov chain analysis gives the reliability of an entire sequence of phases (mission profile).
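The two classical building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's hybrid-degraded model. It covers only the textbook special cases, assuming independent modules, constant hazard rates, a perfect voter, and perfect failure detection and switching. The function and parameter names (`nmr_reliability`, `standby_reliability`, `active_rate`, `dormant_rate`) are hypothetical and chosen for illustration.

```python
from math import comb, exp


def module_reliability(hazard_rate: float, t: float) -> float:
    """Survival probability of one module with a constant hazard rate."""
    return exp(-hazard_rate * t)


def nmr_reliability(n: int, active_rate: float, t: float) -> float:
    """Reliability of an (n + 1)-out-of-(2n + 1) NMR class at time t.

    Assumes independent, identical modules and a perfect voter
    (illustrative assumptions, not taken from the paper).
    """
    total = 2 * n + 1
    p = module_reliability(active_rate, t)
    # The class survives if at least n + 1 of the N = 2n + 1 modules survive.
    return sum(
        comb(total, k) * p**k * (1 - p) ** (total - k)
        for k in range(n + 1, total + 1)
    )


def standby_reliability(active_rate: float, dormant_rate: float, t: float) -> float:
    """Reliability of one active module backed by one standby spare.

    The spare fails at dormant_rate while unpowered and at active_rate
    once switched in. Switching is assumed perfect and instantaneous.
    """
    p_active = exp(-active_rate * t)
    if dormant_rate == 0.0:
        # Cold standby: the spare cannot fail while dormant.
        return p_active * (1.0 + active_rate * t)
    return p_active * (
        1.0 + (active_rate / dormant_rate) * (1.0 - exp(-dormant_rate * t))
    )
```

For n = 1 the first function reduces to the familiar TMR expression 3p^2 - 2p^3, and setting the dormant rate equal to the active rate in the second recovers two modules in parallel, 1 - (1 - p)^2. Both identities serve as quick sanity checks.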
This publication has 7 references indexed in Scilit:
- A Reliability and Comparative Analysis of Two Standby System Configurations. IEEE Transactions on Reliability, 1973
- Reliability Modeling for Fault-Tolerant Computers. IEEE Transactions on Computers, 1971
- On Reliability Modeling and Analysis of Ultrareliable Fault-Tolerant Digital Systems. IEEE Transactions on Computers, 1971
- Reliability Analysis and Architecture of a Hybrid-Redundant Digital System. Published by Association for Computing Machinery (ACM), 1970
- Reliability Modeling Techniques for Self-Repairing Computer Systems. Published by Association for Computing Machinery (ACM), 1969
- A Redundancy Technique for Improving the Reliability of Digital Systems. Published by Defense Technical Information Center (DTIC), 1963
- Upper Bounds on Mean Life of Self-Repairing Systems. IRE Transactions on Reliability and Quality Control, 1962