Low cost management of replicated data in fault-tolerant distributed systems

10 February 1986

journal article
Published by Association for Computing Machinery (ACM) in ACM Transactions on Computer Systems

Vol. 4 (1), 54-70
https://doi.org/10.1145/6306.6309

Abstract

Many distributed systems replicate data for fault tolerance or availability. In such systems, a logical update on a data item results in a physical update on a number of copies. The synchronization and communication required to keep the copies of replicated data consistent introduce a delay when operations are performed. In this paper, we describe a technique that relaxes the usual degree of synchronization, permitting replicated data items to be updated concurrently with other operations, while at the same time ensuring that correctness is not violated. The additional concurrency thus obtained results in better response time when performing operations on replicated data. We also discuss how this technique performs in conjunction with a roll-back and a roll-forward failure recovery mechanism.
