Distributed Uniform Sampling in Unstructured Peer-to-Peer Networks

1 January 2006

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. ucb csd 10 (15301605) , 223c
https://doi.org/10.1109/hicss.2006.126

Abstract

Uniform sampling in networks is at the core of a wide variety of randomized algorithms. Random sampling can be performed by modeling the system as an undirected graph with associated transition probabilities and defining a corresponding Markov chain (MC). A random walk of prescribed minimum length, performed on this graph, yields a stationary distribution, and the corresponding random sample. This sample, however, is not uniform when network nodes have a non-uniform degree distribution. This poses a significant practical challenge since typical large scale real-world unstructured networks tend to have non-uniform degree distributions, e.g., power-law degree distribution in unstructured peer-to-peer networks. In this paper, we present a distributed algorithm that enables efficient uniform sampling in large unstructured non-uniform networks. Specifically, we prescribe necessary conditions for uniform sampling in such networks and present distributed algorithms that satisfy these requirements. We empirically evaluate the performance of our algorithm in comparison to known algorithms. The performance parameters include computational complexity, length of random walk, and uniformity of the sampling. Simulation results support our claims of performance improvements due to our algorithm.

Keywords

This publication has 23 references indexed in Scilit:

Randomized leader election
Distributed Computing, 2007
Unstructured peer-to-peer networks for sharing processor cycles
Parallel Computing, 2006
Choosing a random peer
Published by Association for Computing Machinery (ACM) ,2004
Simple efficient load balancing algorithms for peer-to-peer systems
Published by Association for Computing Machinery (ACM) ,2004
Coping with irregular spatio-temporal sampling in sensor networks
ACM SIGCOMM Computer Communication Review, 2004
Fastest Mixing Markov Chain on a Graph
SIAM Review, 2004
Estimating network size from local information
Information Processing Letters, 2003
Measuring and analyzing the characteristics of Napster and Gnutella hosts
Multimedia Systems, 2003
On near-uniform URL sampling
Computer Networks, 2000
Monte Carlo sampling methods using Markov chains and their applications
Biometrika, 1970