Fault tolerant distributed shared memory algorithms

Abstract
Distributed shared memory (DSM) has received increased attention as a mechanism for interprocess communication in loosely coupled distributed systems because of its perceived advantages over direct use of message passing or remote procedure calls. One problem with most DSM algorithms proposed to date, however, is that they do not tolerate faults. This paper extends four basic DSM algorithms to tolerate single host failures and argues that this degree of fault tolerance is sufficient for most applications. It analyzes the performance of the fault-tolerant DSM algorithms and shows that for some algorithms the additional overhead for fault tolerance is quite small, but that for others the extra overhead can be substantial and even unpredictable.
