Distributed execution of recovery blocks: an approach for uniform treatment of hardware and software faults in real-time applications
- 1 May 1989
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Computers
- Vol. 38 (5), 626-636
- https://doi.org/10.1109/12.24266
Abstract
The concept of distributed execution of recovery blocks is examined as an approach for uniform treatment of hardware and software faults. A useful characteristic of the approach is the relatively small time cost it requires. The approach is thus suitable for incorporation into real-time computer systems. A specific formulation of the approach that is aimed at minimizing the recovery time is presented, called the distributed recovery blocks (DRB) scheme. The DRB scheme is capable of effecting forward recovery while handling both hardware and software faults in a uniform manner. An approach to incorporating the capability for distributed execution of recovery blocks into a load-sharing multiprocessing scheme is also discussed. Two experiments aimed at testing the execution efficiency of the scheme in real-time applications have been conducted on two different multimicrocomputer networks. The results clearly indicate the feasibility of achieving tolerance of hardware and software faults.
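The general idea of running a recovery block on two nodes at once can be illustrated with a minimal sketch. It is not the implementation from the article: it assumes two threads standing in for the primary and shadow nodes, a single acceptance test shared by both, and no watchdog timers or role switching. All names (`run_try_block`, `distributed_recovery_block`, `accept_sqrt`, `faulty_sqrt`, `crashing_sqrt`) are hypothetical and chosen for illustration only.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Tuple


def run_try_block(try_block: Callable[[float], float],
                  acceptance_test: Callable[[float, float], bool],
                  data: float) -> Tuple[bool, float]:
    """Run one try block and report (accepted, result).

    An exception models a node-level (hardware) fault; a result that
    fails the acceptance test models a software fault. Both are
    reported the same way, as a rejected try.
    """
    try:
        result = try_block(data)
    except Exception:
        return (False, 0.0)
    return (acceptance_test(data, result), result)


def distributed_recovery_block(primary: Callable[[float], float],
                               alternate: Callable[[float], float],
                               acceptance_test: Callable[[float, float], bool],
                               data: float) -> float:
    """Run the primary and alternate try blocks concurrently.

    Assumption: two threads stand in for two nodes. Because the shadow
    has already computed its result when the primary is rejected, no
    rollback and retry is needed before a result can be delivered.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        primary_future = pool.submit(run_try_block, primary, acceptance_test, data)
        shadow_future = pool.submit(run_try_block, alternate, acceptance_test, data)
        primary_ok, primary_result = primary_future.result()
        shadow_ok, shadow_result = shadow_future.result()

    if primary_ok:
        return primary_result
    if shadow_ok:
        return shadow_result
    raise RuntimeError("both try blocks were rejected")


# Hypothetical example task: computing a square root.
def accept_sqrt(x: float, r: float) -> bool:
    """Acceptance test: the result squared must reproduce the input."""
    return r >= 0 and abs(r * r - x) < 1e-9


def faulty_sqrt(x: float) -> float:
    """A try block with a design (software) fault."""
    return x / 2


def crashing_sqrt(x: float) -> float:
    """A try block on a node that fails outright."""
    raise OSError("simulated node failure")
```

Under these assumptions, `distributed_recovery_block(faulty_sqrt, math.sqrt, accept_sqrt, 9.0)` and `distributed_recovery_block(crashing_sqrt, math.sqrt, accept_sqrt, 16.0)` both deliver the shadow's result, which is the sense in which the two fault classes are treated uniformly in this sketch. A real deployment would place the try blocks on separate processors and bound each try with a deadline.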