Probabilistic location and routing

25 June 2003

proceedings article
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 3, 1248-1257
https://doi.org/10.1109/infcom.2002.1019375

Abstract

We propose probabilistic location to enhance the per- formance of existing peer-to-peer location mechanisms in the case where a replica for the queried data item exists close to the query source. We introduce the attenuated Bloom filter , a lossy dis- tributed index. We describe how to use these data structures for document location and how to maintain them despite document motion. We include a detailed performance study which indicates that our algorithm performs as desired, both finding closer repli- cas and finding them faster than deterministic algorithms alone. I. INTRODUCTION Today's exponential growth in network bandwidth and stor- age capacity has inspired a whole new class of distributed, peer- to-peer storage infrastructures. Systems such as Farsite ( 1), Freenet (2), Intermemory (3), OceanStore (4), CFS (5), and PAST (6) seek to capitalize on the rapid growth of resources to provide inexpensive, highly-available storage without ce ntral- ized servers. The designers of these systems propose to achi eve high availability and long-term durability in the face of in di- vidual component failures through replication and coding tech- niques. Although wide-scale replication has the potential to incre ase availability and durability, it introduces two important c hal- lenges to system architects. First, if replicas may be place d anywhere in the system, how should we locate them? Sec- ond, once we have located one or more replicas, how should we route queries to them? We can formulate the combination of these two operations as a single location and routingproblem that efficiently routes queries from a client to the closest r eplica adhering to certain properties, such as the replica with the short- est network path to the client or the replica residing on the l east loaded server. In many cases, combining location and routing into a single, compound operation yields the greatest flexib ility to route queries quickly with minimal network overhead. The importance of such location-independent routingtechniques is well recognized in the community, and several proposals such as CAN (7), Chord (8), Pastry (9), and Tapestry (10) are cur- rently under study. These existing schemes share the characteristic that in the worst case, a location and routing operation requires O(logN) sequential network messages to search a distributed system of N servers. Some of these algorithms use substantially fewer messages to perform their task in the common case. This scala- bility is commendable, and it allows for the total query rout ing time to be close to optimal when the replica is far from the query source. However, as the replica approaches the location of the query source, the performance of the existing algorit hms quickly diverges from optimality. This divergence is easy t o

Keywords

This publication has 14 references indexed in Scilit:

Generalizing "search" in generalized search trees
Published by Institute of Electrical and Electronics Engineers (IEEE) ,2002
A scalable content-addressable network
Published by Association for Computing Machinery (ACM) ,2001
Wide-area cooperative storage with CFS
Published by Association for Computing Machinery (ACM) ,2001
A prototype implementation of archival Intermemory
Published by Association for Computing Machinery (ACM) ,1999
Cache digests
Computer Networks and ISDN Systems, 1998
Algorithmic Design of the Globe Wide-Area Location Service
The Computer Journal, 1998
The LRU-K page replacement algorithm for database disk buffering
Published by Association for Computing Machinery (ACM) ,1993
The DASH prototype: Logic overhead and performance
IEEE Transactions on Parallel and Distributed Systems, 1993
Development of the domain name system
Published by Association for Computing Machinery (ACM) ,1988
Space/time trade-offs in hash coding with allowable errors
Communications of the ACM, 1970