Keeping processes under surveillance

10 December 2002

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 198-205
https://doi.org/10.1109/reldis.1991.145424

Abstract

This paper describes two solutions to the surveillance problem, both based on an election algorithm that must cope with process and communication failures. The election algorithm is presented in detail. The surveillance algorithms are simple and efficient: with n processes under surveillance, the central crash detection protocol requires n+1 messages per surveillance period, and the distributed approach requires n messages. With the distributed approach, the election algorithm must be executed after each crash detection to determine a new ring manager, which generates a new token and establishes the virtual ring. With the central protocol, a new crash detection manager has to be determined only if the old manager has failed.
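
The message counts and re-election rules stated in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the counts arise, so the breakdown here is an assumption: the central manager is taken to send one broadcast poll that all n processes answer, and the ring token is taken to make one hop per process. The function names are hypothetical and do not come from the paper.

```python
def central_messages(n: int) -> int:
    """Messages per surveillance period with a central crash detection manager."""
    polls = 1      # assumption: a single broadcast poll reaches all n processes
    replies = n    # assumption: each process under surveillance replies once
    return polls + replies


def distributed_messages(n: int) -> int:
    """Messages per surveillance period with a token on a virtual ring."""
    return n       # assumption: the token makes one hop per process


def election_needed(protocol: str, manager_crashed: bool) -> bool:
    """Whether a detected crash triggers the election algorithm."""
    if protocol == "distributed":
        # Per the abstract: every crash detection requires a new ring manager,
        # which generates a new token and re-establishes the virtual ring.
        return True
    if protocol == "central":
        # Per the abstract: a new manager is needed only if the old one failed.
        return manager_crashed
    raise ValueError(f"unknown protocol: {protocol!r}")


if __name__ == "__main__":
    n = 5
    print(central_messages(n), distributed_messages(n))
    print(election_needed("central", manager_crashed=False))
```

Under these assumptions the distributed approach saves one message per period, but it pays for an election after every detected crash, whereas the central protocol runs an election only when the manager itself fails.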
