Abstract
Two solutions to the surveillance problem are described. Both are based on an election algorithm that must cope with process and communication failures; the election algorithm is presented in detail. The surveillance algorithms are simple and efficient: with n processes under surveillance, the central crash detection protocol requires n+1 messages per surveillance period, and the distributed protocol requires n messages. With the distributed approach, the election algorithm must be executed after every crash detection to determine a new ring manager, which generates a new token and establishes the virtual ring. With the central protocol, a new crash detection manager has to be determined only if the old manager itself has failed.
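The message counts quoted above can be illustrated with a minimal Python sketch. It is not the paper's protocol: the abstract does not say how the n+1 messages of the central protocol are composed, so the sketch assumes one broadcast probe from the manager plus one reply from each live process. The function names, the process identifiers, and the use of a set to model which processes are alive are all hypothetical.

```python
from typing import List, Optional, Sequence, Set, Tuple


def circulate_token(ring: Sequence[str], alive: Set[str]) -> Tuple[int, Optional[str]]:
    """Pass a token once around a virtual ring (distributed approach).

    Returns the number of messages sent and the first crashed process
    found, or None if the token completed the round.
    """
    messages = 0
    for i, _sender in enumerate(ring):
        receiver = ring[(i + 1) % len(ring)]
        messages += 1  # one token hop from sender to its successor
        if receiver not in alive:
            # Assumption: a missing acknowledgement reveals the crash.
            return messages, receiver
    return messages, None


def central_round(watched: Sequence[str], alive: Set[str]) -> Tuple[int, List[str]]:
    """One surveillance period of a central manager (assumed message pattern)."""
    messages = 1  # assumption: a single broadcast probe from the manager
    crashed: List[str] = []
    for process in watched:
        if process in alive:
            messages += 1  # one reply per live process
        else:
            crashed.append(process)
    return messages, crashed


if __name__ == "__main__":
    processes = ["p0", "p1", "p2", "p3"]
    print(circulate_token(processes, set(processes)))
    print(central_round(processes, set(processes)))
```

With n = 4 live processes, the ring round uses 4 messages and the central round uses 5, matching the n and n+1 figures under the stated assumptions. In the ring case a detected crash would be followed by an election of a new ring manager, which this sketch does not model.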
