Fault-tolerant distributed systems based on broadcast communication
- 7 January 2003
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 129-134
- https://doi.org/10.1109/icdcs.1989.37940
Abstract
Distributed systems present problems of maintaining consistency of distributed data in the presence of faults. These problems are currently solved by agreement protocols that require many messages to be exchanged between processors, with adverse effects on system performance. An approach is presented to the design of fault-tolerant distributed systems that avoids this message exchange, resulting in systems that are substantially more efficient. This approach is based on broadcast communication over a local area network such as the Ethernet, and on two novel protocols: the Trans protocol, which provides efficient reliable broadcast communication, and the Total protocol, which, with high probability, promptly places a total order on messages and achieves distributed agreement even in the presence of a fault. Reliable distributed operations, such as locking, update, and commitment, require only a single broadcast message rather than the several tens of messages required by current algorithms.
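
The claim that locking needs only a single broadcast can be illustrated with a minimal sketch. It is not the Trans or Total protocol: it simply assumes that some lower layer already delivers every broadcast to every processor in the same total order, and shows that each processor can then decide lock ownership locally, with no further message exchange. The class LockReplica, the message tuples, and the processor names P1 to P3 are hypothetical and chosen only for illustration.

```python
from collections import deque


class LockReplica:
    """One processor's local lock table, updated only by delivered broadcasts."""

    def __init__(self, name):
        self.name = name
        self.holder = {}   # lock id -> processor currently holding the lock
        self.waiting = {}  # lock id -> queue of processors waiting for it

    def deliver(self, msg):
        # Assumption: every replica sees the same messages in the same order.
        kind, lock, sender = msg
        if kind == "acquire":
            if lock not in self.holder:
                self.holder[lock] = sender
            else:
                self.waiting.setdefault(lock, deque()).append(sender)
        elif kind == "release" and self.holder.get(lock) == sender:
            queue = self.waiting.get(lock)
            if queue:
                # Hand the lock to the earliest waiter in the total order.
                self.holder[lock] = queue.popleft()
            else:
                del self.holder[lock]


# A hypothetical totally ordered stream of broadcasts (one message per operation).
ordered_broadcasts = [
    ("acquire", "x", "P1"),
    ("acquire", "x", "P2"),
    ("release", "x", "P1"),
]

replicas = [LockReplica(name) for name in ("P1", "P2", "P3")]
for msg in ordered_broadcasts:
    for replica in replicas:
        replica.deliver(msg)

# Every replica reaches the same conclusion about who holds lock "x".
views = [replica.holder for replica in replicas]
```

Because each replica applies the same deterministic rule to the same ordered sequence, all local views agree without any acknowledgement or voting round. The sketch leaves out everything that makes the problem hard in practice, namely message loss, processor failure, and establishing the order itself, which is what the abstract attributes to the two protocols.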