General linear codes for fault-tolerant matrix operations on processor arrays

6 January 2003

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 180-185
https://doi.org/10.1109/ftcs.1988.5317

Abstract

Various checksum codes have been suggested for fault-tolerant matrix computations on processor arrays. Use of these codes is limited by potential roundoff and overflow errors, and numerical errors may be misconstrued as errors caused by physical faults in the system. The authors identify a set of linear codes that can be used for fault-tolerant matrix operations such as matrix addition, multiplication, transposition, and LU decomposition with minimum numerical error. Encoding schemes are given for some example codes that fall under this general set. On the basis of experiments, the authors derive a rule of thumb for selecting a code for a given application. Since the overall error in the code also depends on how the coding scheme is implemented, they suggest specific algorithms and special hardware realizations for computing the check elements.
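
The abstract does not spell out the codes themselves, so the following is only a minimal sketch of the general idea of a weighted (linear) checksum code applied to matrix multiplication, not the authors' scheme. The function names, the weight vectors, and the tolerance `tol` are all assumptions made for illustration. With all weights equal to 1 the sketch reduces to the classic row and column checksum; the tolerance is there because, as the abstract notes, roundoff could otherwise be mistaken for a physical fault.

```python
def matmul(a, b):
    """Plain product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add_check_row(a, row_weights):
    """Append one check row to a: the weighted sum of its rows (hypothetical helper)."""
    check = [sum(w * row[j] for w, row in zip(row_weights, a))
             for j in range(len(a[0]))]
    return a + [check]


def add_check_column(b, col_weights):
    """Append one check column to b: the weighted sum of its columns (hypothetical helper)."""
    return [row + [sum(w * x for w, x in zip(col_weights, row))] for row in b]


def check_product(c_full, row_weights, col_weights, tol=1e-9):
    """Return (bad_rows, bad_cols) whose checks disagree beyond a relative tolerance.

    c_full is the product of a row-checked A and a column-checked B, so its
    last row and last column hold the check elements. The tolerance value is
    an assumption; a real system would derive it from an error analysis.
    """
    n, m = len(c_full) - 1, len(c_full[0]) - 1
    bad_rows, bad_cols = [], []
    for i in range(n):
        terms = [v * c_full[i][j] for j, v in enumerate(col_weights)]
        scale = max(1.0, sum(abs(t) for t in terms) + abs(c_full[i][m]))
        if abs(sum(terms) - c_full[i][m]) > tol * scale:
            bad_rows.append(i)
    for j in range(m):
        terms = [w * c_full[i][j] for i, w in enumerate(row_weights)]
        scale = max(1.0, sum(abs(t) for t in terms) + abs(c_full[n][j]))
        if abs(sum(terms) - c_full[n][j]) > tol * scale:
            bad_cols.append(j)
    return bad_rows, bad_cols


# Usage: encode, multiply, then verify the result.
A = [[1.5, 2.0], [0.25, -4.0]]
B = [[3.0, 1.0], [2.0, 0.5]]
row_w = [1.0, 1.0]  # weights over the rows of A (illustrative choice)
col_w = [1.0, 1.0]  # weights over the columns of B (illustrative choice)

C_full = matmul(add_check_row(A, row_w), add_check_column(B, col_w))
clean = check_product(C_full, row_w, col_w)

C_full[0][1] += 1.0  # simulate a fault in a single processing element
faulty = check_product(C_full, row_w, col_w)
```

In this sketch a single corrupted element shows up as exactly one inconsistent row and one inconsistent column, which locates it. Choosing other weight vectors gives other linear codes; how that choice affects numerical error is the question the paper addresses.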
