NFTAPE: a framework for assessing dependability in distributed systems with lightweight fault injectors
- 7 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Many fault injection tools are available for dependability assessment. Although these tools are good at injecting a single fault model into a single system, they suffer from two main limitations for use in distributed systems: (1) no single tool is sufficient for injecting all necessary fault models; (2) it is difficult to port these tools to new systems. NFTAPE, a tool for composing automated fault injection experiments from available lightweight fault injectors, triggers, monitors, and other components, helps to solve these problems. We have conducted experiments using NFTAPE with several types of lightweight fault injectors, including driver-based, debugger-based, target-specific, simulation-based, hardware-based, and performance-fault injections. Two example experiments are described in this paper. The first uses a hardware fault injector with a Myrinet LAN; the other uses a Software Implemented Fault Injection (SWIFI) fault injector to target a space-imaging application.
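The composition idea in the abstract (an experiment assembled from interchangeable injectors, triggers, and monitors) can be illustrated with a minimal sketch. This is not NFTAPE's actual interface: every name below (`Trigger`, `Injector`, `Monitor`, `StepTrigger`, `BitFlipInjector`, `ChecksumMonitor`, `run_experiment`) is hypothetical, and the "target" is just an in-memory byte buffer standing in for a real system.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


# Hypothetical component interfaces; NFTAPE's real API is not described in the abstract.
class Trigger(Protocol):
    def should_fire(self, step: int) -> bool: ...


class Injector(Protocol):
    def inject(self, target: bytearray) -> None: ...


class Monitor(Protocol):
    def observe(self, step: int, target: bytearray) -> None: ...


@dataclass
class StepTrigger:
    """Fires once, at a fixed step of the workload."""
    at_step: int

    def should_fire(self, step: int) -> bool:
        return step == self.at_step


@dataclass
class BitFlipInjector:
    """Flips one bit in the target buffer (a simple software-implemented fault)."""
    byte_index: int
    bit: int

    def inject(self, target: bytearray) -> None:
        target[self.byte_index] ^= 1 << self.bit


@dataclass
class ChecksumMonitor:
    """Records every step at which the buffer's checksum differs from the golden value."""
    golden: int
    mismatches: List[int] = field(default_factory=list)

    def observe(self, step: int, target: bytearray) -> None:
        if sum(target) % 256 != self.golden:
            self.mismatches.append(step)


def run_experiment(target: bytearray, trigger: Trigger, injector: Injector,
                   monitors: List[Monitor], steps: int) -> None:
    """Drive the workload; inject when the trigger fires; let monitors observe each step."""
    for step in range(steps):
        if trigger.should_fire(step):
            injector.inject(target)
        for monitor in monitors:
            monitor.observe(step, target)


target = bytearray([1, 2, 3, 4])
monitor = ChecksumMonitor(golden=sum(target) % 256)
run_experiment(target, StepTrigger(at_step=2),
               BitFlipInjector(byte_index=0, bit=3), [monitor], steps=5)
print(monitor.mismatches)  # [2, 3, 4]
```

Because the experiment loop depends only on the three small interfaces, swapping in a different fault model (for example, a hypothetical delay injector for performance faults) would not require changing the trigger, the monitor, or the loop itself.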