High-availability computer systems
- 1 September 1991
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in Computer
- Vol. 24 (9) , 39-48
- https://doi.org/10.1109/2.84898
Abstract
The techniques used to build highly available computer systems are sketched. Historical background is provided, and terminology is defined. Empirical experience with computer failure is briefly discussed. Device improvements that have greatly increased the reliability of digital electronics are identified. Fault-tolerant design concepts and approaches to fault-tolerant hardware are outlined. The role of repair and maintenance and of design-fault tolerance is discussed. Software repair is considered. The use of pairs of computer systems at separate locations to guard against unscheduled outages due to outside sources (communication or power failures, earthquakes, etc.) is addressed.
This publication has 7 references indexed in Scilit:
- Dependable Computing and Fault Tolerance: Concepts and Terminology. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
- A census of Tandem system availability between 1985 and 1990. IEEE Transactions on Reliability, 1990
- On the Reliability of the IBM MVS/XA Operating System. IEEE Transactions on Software Engineering, 1987
- The Evolution of Fault-Tolerant Computing. Published by Springer Nature, 1987
- The N-Version Approach to Fault-Tolerant Software. IEEE Transactions on Software Engineering, 1985
- Optimizing Preventive Service of Software Products. IBM Journal of Research and Development, 1984
- Reliability Issues in Computing System Design. ACM Computing Surveys, 1978