Memory System Design for Tolerating Single Event Upsets

Abstract
This paper presents a new memory system design that uses fault-tolerant design techniques to tolerate errors caused by radiation-induced single event upsets. The classical Triple Modular Redundancy (TMR) technique has an extremely high insertion cost (a factor of 3 to 4); duplication techniques can detect, but not correct, errors and have an insertion cost of a factor of 2 to 3. Data encoding techniques have low cost but can correct only errors in stored data, not control errors. What is needed, therefore, is an effective mixture of known and new techniques that achieves adequately high reliability at minimum cost. The proposed memory system design uses coding, control duplication, and scrubbing to tolerate soft errors from single event upsets. This memory permits the use of less costly, conventional unhardened memory technology and has a very low insertion cost of approximately 25% for achieving fault tolerance. Furthermore, events that upset control logic are tolerated as well as those that affect stored data. This approach therefore offers a cost trade-off with respect to hardened-technology memory and tolerates combinational control logic upsets, which some hardening techniques cannot tolerate.
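
As a rough illustration of how coding and scrubbing can work together, the following minimal Python sketch stores each word as a single-error-correcting codeword and periodically rewrites any word found with a flipped bit. The abstract does not name the code used in the proposed design, so the Hamming(7,4) code here is an assumption chosen only for brevity; its check-bit overhead is far higher than the roughly 25% insertion cost quoted above, and the overhead of such codes generally shrinks as the data word widens. The function names (`encode`, `correct`, `decode`, `scrub`) are hypothetical, and control duplication is not modeled.

```python
# Illustrative sketch only: Hamming(7,4) is an assumed stand-in for the
# (unspecified) code in the paper. Codeword positions are numbered 1..7;
# position p is stored in bit (p - 1) of the integer.

def encode(nibble):
    """Encode a 4-bit value as a 7-bit Hamming codeword."""
    d = [(nibble >> i) & 1 for i in range(4)]
    bits = [0] * 8                       # index 0 unused
    bits[3], bits[5], bits[6], bits[7] = d
    bits[1] = bits[3] ^ bits[5] ^ bits[7]   # parity over positions 1,3,5,7
    bits[2] = bits[3] ^ bits[6] ^ bits[7]   # parity over positions 2,3,6,7
    bits[4] = bits[5] ^ bits[6] ^ bits[7]   # parity over positions 4,5,6,7
    return sum(bits[p] << (p - 1) for p in range(1, 8))

def syndrome(word):
    """XOR of the positions of all set bits; 0 for a valid codeword."""
    s = 0
    for p in range(1, 8):
        if (word >> (p - 1)) & 1:
            s ^= p
    return s

def correct(word):
    """Return (corrected_word, was_corrected), assuming at most one flipped bit."""
    s = syndrome(word)
    if s:
        word ^= 1 << (s - 1)             # the syndrome names the flipped position
    return word, s != 0

def decode(word):
    """Correct a single-bit error, then extract the 4 data bits."""
    word, _ = correct(word)
    return (((word >> 2) & 1)
            | (((word >> 4) & 1) << 1)
            | (((word >> 5) & 1) << 2)
            | (((word >> 6) & 1) << 3))

def scrub(memory):
    """Walk the memory, rewrite any word with a correctable error, return the count."""
    fixed = 0
    for addr, word in enumerate(memory):
        repaired, changed = correct(word)
        if changed:
            memory[addr] = repaired      # write back before a second upset can accumulate
            fixed += 1
    return fixed

if __name__ == "__main__":
    memory = [encode(v) for v in range(16)]
    memory[5] ^= 1 << 2                  # simulate an upset in word 5
    memory[9] ^= 1 << 6                  # simulate an upset in word 9
    print("words repaired:", scrub(memory))
    print("data intact:", [decode(w) for w in memory] == list(range(16)))
```

The point of scrubbing in this sketch is timing: a single-error-correcting code fails if two upsets land in the same word, so rewriting corrected words at intervals keeps isolated upsets from accumulating into uncorrectable ones.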
