The Soft Error Problem: An Architectural Perspective

Radiation-induced soft errors have emerged as a key challenge in computer system design. If the industry is to continue to provide customers with the level of reliability they expect, microprocessor architects must address this challenge directly. This effort has two parts. First, architects must understand the impact of soft errors on their designs. Second, they must select judiciously from among available techniques to reduce this impact in order to meet their reliability targets with minimum overhead. To provide a foundation for these efforts, this paper gives a broad overview of the soft error problem from an architectural perspective. We start with basic definitions, followed by a description of techniques to compute the soft error rate. We then summarize techniques used to reduce the soft error rate and describe the problems posed by double-bit errors. Finally, we outline future directions for architecture research in soft errors.
