An approach to constructing modular fault-tolerant protocols
- 30 December 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Modularization is a well-known technique for simplifying complex software. Here, an approach to modularizing fault-tolerant protocols such as reliable multicast and membership is described. The approach is based on implementing a protocol's individual properties as separate micro-protocols, and then combining selected micro-protocols using an event-driven software framework; a system is constructed by composing these frameworks with traditional network protocols using standard hierarchical techniques. In addition to simplifying the software, this model helps clarify the dependencies among properties of fault-tolerant protocols, and makes it possible to construct systems that are customized to the specifics of the application or underlying architecture. An example involving reliable group multicast is given, together with a description of a prototype implementation using the SR concurrent programming language. An implementation based on the x-kernel and RT-Mach is also underway.
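The composition idea in the abstract can be illustrated with a minimal sketch. It is written in Python rather than SR, and it is not the paper's framework: the class names (`Framework`, `DuplicateFilter`, `FifoOrder`, `Application`), the event names (`msg_from_net`, `deliver`), and the message layout are all hypothetical, chosen only to show one property per micro-protocol and an event-driven framework that combines a selected set of them.

```python
from collections import defaultdict


class Framework:
    """Hypothetical event-driven framework: micro-protocols register handlers."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self.delivered = []  # payloads handed to the application, in order

    def register(self, event, handler):
        self._handlers[event].append(handler)

    def raise_event(self, event, msg):
        # Handlers run in registration order; returning False stops the chain.
        for handler in self._handlers[event]:
            if handler(msg) is False:
                return False
        return True


class DuplicateFilter:
    """Assumed micro-protocol for one property: at-most-once delivery."""

    def __init__(self, fw):
        self._seen = set()
        fw.register("msg_from_net", self.on_message)

    def on_message(self, msg):
        key = (msg["sender"], msg["seq"])
        if key in self._seen:
            return False  # drop the duplicate before later handlers see it
        self._seen.add(key)
        return True


class FifoOrder:
    """Assumed micro-protocol for one property: per-sender FIFO delivery."""

    def __init__(self, fw):
        self._fw = fw
        self._next = defaultdict(int)  # next expected sequence number per sender
        self._pending = {}             # messages that arrived ahead of their turn
        fw.register("msg_from_net", self.on_message)

    def on_message(self, msg):
        sender = msg["sender"]
        self._pending[(sender, msg["seq"])] = msg
        # Release every buffered message that is now in order.
        while (sender, self._next[sender]) in self._pending:
            ready = self._pending.pop((sender, self._next[sender]))
            self._next[sender] += 1
            self._fw.raise_event("deliver", ready)
        return True


class Application:
    """Stand-in for the layer above the composite protocol."""

    def __init__(self, fw):
        self._fw = fw
        fw.register("deliver", self.on_deliver)

    def on_deliver(self, msg):
        self._fw.delivered.append(msg["data"])
        return True


def build(micro_protocols):
    """Compose a framework from the selected micro-protocol classes."""
    fw = Framework()
    for micro_protocol in micro_protocols:
        micro_protocol(fw)
    return fw


fw = build([DuplicateFilter, FifoOrder, Application])
# Messages from sender "p1" arrive out of order, with one duplicate.
for seq, data in [(1, "b"), (0, "a"), (1, "b"), (2, "c")]:
    fw.raise_event("msg_from_net", {"sender": "p1", "seq": seq, "data": data})
print(fw.delivered)  # ['a', 'b', 'c']
```

In this sketch the list passed to `build` is the point of customization: each property lives in its own class, and the set of classes selected determines which guarantees the composed service offers.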