Probabilistic reliable dissemination in large-scale systems

Abstract
The growth of the Internet raises new challenges for the design of distributed systems and applications. In the context of group communication protocols, gossip-based schemes have attracted interest because they are scalable, easy to deploy, and resilient to network and process failures. However, traditional gossip-based protocols have two major drawbacks: 1) they rely on each peer having knowledge of the global membership; and 2) being oblivious to the network topology, they can impose a high load on network links when applied to wide-area settings. In this paper, we provide a theoretical analysis of gossip-based protocols that relates their reliability to key system parameters (the system size, failure rates, and number of gossip targets). The results provide guidelines for the design of practical protocols. In particular, they show how reliability can be maintained while alleviating drawback 1) by providing each peer with only a small subset of the total membership information, and drawback 2) by organizing members into a hierarchical structure that reflects their proximity according to some network-related metric. We validate the analytical results by simulations and verify that the hierarchical gossip protocol considerably reduces the load on the network compared to the original, non-hierarchical protocol.
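To make the parameters named above concrete, the following is a minimal simulation sketch of a flat push-gossip broadcast. It is not the paper's model or protocol: the function name simulate_gossip, the choice of uniformly random targets, the rule that each node forwards a message once, and the independent crash-failure model are all assumptions made for illustration.

```python
import random


def simulate_gossip(n, fanout, fail_prob=0.0, seed=None):
    """Fraction of non-failed nodes reached by one gossip broadcast.

    Hypothetical model: node 0 is the source and never fails; every other
    node fails independently with probability fail_prob; each informed
    node forwards the message once to `fanout` distinct random targets.
    """
    rng = random.Random(seed)
    alive = [True] + [rng.random() >= fail_prob for _ in range(n - 1)]
    informed = {0}
    frontier = [0]
    while frontier:
        next_frontier = []
        for node in frontier:
            # Sample distinct targets among the n - 1 other nodes.
            for t in rng.sample(range(n - 1), min(fanout, n - 1)):
                target = t if t < node else t + 1  # skip the sender itself
                # Failed nodes silently drop the message.
                if alive[target] and target not in informed:
                    informed.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
    return len(informed) / sum(alive)


if __name__ == "__main__":
    # Vary the fanout to see how coverage changes for a fixed system size.
    for k in (1, 2, 4, 8):
        runs = [simulate_gossip(1000, k, fail_prob=0.05, seed=s) for s in range(20)]
        print(k, sum(runs) / len(runs))
```

Running the script prints the average coverage for a few fanout values at a fixed system size and failure rate, which is the kind of relationship the analysis characterizes. The sketch samples targets from the full membership, so it reflects the traditional setting rather than the partial-view or hierarchical variants.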
