Bi-level reconfigurations of fault tolerant arrays in bi-modal computational environments
- 7 January 2003
- proceedings article
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Fault-tolerant architectures and algorithms are studied for processor arrays subject to computational loads that alternate between two phases: a strict phase, characterized by a heavy load and strict constraints on response time, and a relaxed phase, characterized by a light load and relatively relaxed constraints on response time. Under this type of load, a bilevel algorithm may be applied to reconfigure the system after faults. At the first level, called the fast response level, a local distributed fault-tolerant algorithm is used during the strict phase to achieve fast fault recovery, at the expense of a possibly rapid degradation in the capacity to tolerate future faults. To limit the effect of this degradation, a second level, called the optimization level, is added. At that level, a global, relatively slow reorganization algorithm is applied during the relaxed phase to restore the system to a configuration that ensures adequate fault-tolerance capability for the remainder of the system's mission. Three examples of bilevel reconfiguration algorithms are given, each emphasizing a different restoration criterion.
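The two levels described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a hypothetical array split into blocks, each holding a count of spare processors, and it uses an even spread of spares as one possible restoration criterion. The names `BlockArray`, `fast_recover`, and `optimize` are invented for illustration, and the sketch ignores the interconnect constraints that a real array would impose on moving spares.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlockArray:
    # Hypothetical model: spares[i] is the number of unused spare
    # processors remaining in block i.
    spares: List[int]

    def fast_recover(self, block: int) -> bool:
        """Fast response level (strict phase).

        Cover a fault using only the faulty block's own spares.
        Returns False if the block has no spare left.
        """
        if self.spares[block] > 0:
            self.spares[block] -= 1
            return True
        return False

    def optimize(self) -> None:
        """Optimization level (relaxed phase).

        Globally redistribute the remaining spares as evenly as
        possible across blocks (an assumed restoration criterion).
        """
        total, n = sum(self.spares), len(self.spares)
        base, extra = divmod(total, n)
        self.spares = [base + (1 if i < extra else 0) for i in range(n)]


# Usage: two faults hit block 0 during the strict phase.
arr = BlockArray(spares=[2, 2, 2])
arr.fast_recover(0)
arr.fast_recover(0)
spares_after_strict = list(arr.spares)   # block 0 is now exhausted
arr.optimize()                           # relaxed phase: rebalance
spares_after_relaxed = list(arr.spares)
can_recover_again = arr.fast_recover(0)  # block 0 can tolerate a fault again
```

In this toy model, local recovery is a constant-time operation on one block, while the global step touches every block, which mirrors the fast-but-degrading versus slow-but-restoring trade-off the abstract describes.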
This publication has 12 references indexed in Scilit:
- Bi-level reconfigurations of fault tolerant arrays. IEEE Transactions on Computers, 1992
- An evaluation of system-level fault tolerance on the Intel hypercube multiprocessor. Institute of Electrical and Electronics Engineers (IEEE), 1988
- Distributed Fault-Tolerance of Tree Structures. IEEE Transactions on Computers, 1987
- Reconfigurable Tree Architectures Using Subtree Oriented Fault Tolerance. IEEE Transactions on Computers, 1987
- A Fault-Tolerant Modular Architecture for Binary Trees. IEEE Transactions on Computers, 1986
- Fault Tolerance Techniques for Array Structures Used in Supercomputing. Computer, 1986
- Fault-Tolerant Computing—Concepts and Examples. IEEE Transactions on Computers, 1984
- Algorithm-Based Fault Tolerance for Matrix Operations. IEEE Transactions on Computers, 1984
- The Diogenes Approach to Testable Fault-Tolerant Arrays of Processors. IEEE Transactions on Computers, 1983
- Reliability Modeling for Fault-Tolerant Computers. IEEE Transactions on Computers, 1971