Improving storage system availability with D-GRAID

1 May 2005

journal article
Published by Association for Computing Machinery (ACM) in ACM Transactions on Storage

Vol. 1 (2), 133-170
https://doi.org/10.1145/1063786.1063787

Abstract

We present the design, implementation, and evaluation of D-GRAID, a gracefully degrading and quickly recovering RAID storage array. D-GRAID ensures that most files within the file system remain available even when an unexpectedly high number of faults occur. D-GRAID achieves high availability through aggressive replication of semantically critical data, and fault-isolated placement of logically related data. D-GRAID also recovers from failures quickly, restoring only live file system data to a hot spare. Both graceful degradation and live-block recovery are implemented in a prototype SCSI-based storage system underneath unmodified file systems, demonstrating that powerful “file-system like” functionality can be implemented within a “semantically smart” disk system behind a narrow block-based interface.
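
To illustrate why fault-isolated placement of logically related data can keep more files available than conventional striping, the following is a minimal sketch, not the D-GRAID implementation. It assumes a toy array with no parity or mirroring, treats a file as available only if none of its blocks sits on a failed disk, and ignores the replication of semantically critical metadata. All function names and parameters are hypothetical.

```python
def striped_placement(file_id, num_blocks, num_disks):
    """Disks touched when a file's blocks are spread round-robin (striping)."""
    return {(file_id + i) % num_disks for i in range(num_blocks)}


def isolated_placement(file_id, num_blocks, num_disks):
    """Disks touched when all of a file's blocks stay on a single disk."""
    return {file_id % num_disks}


def available_fraction(placement, num_files, num_blocks, num_disks, failed):
    """Fraction of files with no block on a failed disk (no redundancy modeled)."""
    ok = sum(
        1
        for f in range(num_files)
        if not (placement(f, num_blocks, num_disks) & failed)
    )
    return ok / num_files


if __name__ == "__main__":
    failed_disks = {0, 1}  # two of four disks lost (assumed scenario)
    for name, fn in [("striped", striped_placement),
                     ("isolated", isolated_placement)]:
        frac = available_fraction(fn, 100, 8, 4, failed_disks)
        print(f"{name}: {frac:.2f} of files available")
```

In this toy setting every striped file touches all four disks, so losing any disk makes every file unavailable, whereas with isolated placement only the files stored on the failed disks are lost.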
