Massively parallel systems you can trust
- 17 December 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Massively parallel systems are needed to address the performance requirements of many current and future commercial applications. These applications require systems that can be trusted to protect the integrity of data, to continue operation in spite of failures, and to provide scalable performance to serve ever-increasing customer needs. The design of current scientific MPPs has not considered many of these commercial requirements. Tandem has developed systems that have been designed specifically to address both the reliability and performance requirements for large commercial applications. The recently introduced Himalaya Range with the TorusNet interconnection architecture extends the capabilities of Tandem systems into the range of massively parallel systems. Recent benchmarks have demonstrated both the scalability and fault-tolerance of these systems.