Virtually-Synchronous Communication Based on a Weak Failure Suspector

Abstract
Failure detectors (or, more accurately Failure Suspectors - FS) appear to be a fundamental service upon which to build fault-tolerant, distributed applications. This paper shows that a FS with very weak semantics (i.e. that delivers failure and recovery information in no specific order) suffices to implement virtually-synchronous communication (VSC) in an asynchronous system subject to process crash failures and network partitions. The VSC paradigm is particularly useful in asynchronous systems and greatly simplifies building fault-tolerant applications that mask failures by replicating processes. We suggest a three-component architecture to implement virtually-synchronous communication: (1) at the lowest level, the FS component; on top of it, (2a) a component that defines new views, and (2b) a component that reliably multicasts messages within a view. The issues covered in this paper also lead to a better understanding of the various membership service semantics proposed in recent literature.

This publication has 0 references indexed in Scilit: