Modeling fault-tolerant mobile agent execution as a sequence of agreement problems

7 November 2002

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 1477, 11-20
https://doi.org/10.1109/reldi.2000.885388

Abstract

Fault tolerance is fundamental to the further development of mobile agent applications. In the context of mobile agents, fault tolerance prevents a partial or complete loss of the agent, i.e., it ensures that the agent arrives at its destination. Simple approaches such as checkpointing are prone to blocking. Replication can in principle improve on checkpointing-based solutions. However, existing solutions in this context either assume a perfect failure detection mechanism (which is not realistic in an environment such as the Internet) or rely on complex solutions based on leader election and distributed transactions, of which only a subset prevents blocking. The paper proposes a novel approach to fault-tolerant mobile agent execution, based on modeling agent execution as a sequence of agreement problems. Each agreement problem is one instance of the well-understood consensus problem. Our solution does not require a perfect failure detection mechanism, while preventing blocking and ensuring that the agent is executed exactly once.
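
The idea of treating each stage of the agent's itinerary as one agreement instance can be illustrated with a small toy model. The sketch below is not the paper's protocol: it assumes a hypothetical setup in which every stage has a fixed set of replica places, each live replica computes a tentative result, and a stand-in `decide` function plays the role of consensus by deterministically picking one proposal. The names `Proposal`, `decide`, and `run_agent` are invented for illustration, and crashes are simulated with a plain set rather than detected by an unreliable failure detector.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Proposal:
    place: str  # hypothetical name of the replica place that ran the stage
    state: int  # agent state after executing the stage at that place


def decide(proposals: Dict[str, Optional[Proposal]]) -> Proposal:
    """Stand-in for one consensus instance (an assumption, not a real protocol).

    Given the same proposals, every caller obtains the same decision, which
    is the agreement property the toy model relies on.
    """
    live = [p for p in proposals.values() if p is not None]
    if not live:
        raise RuntimeError("all replicas of this stage failed")
    # Deterministic choice: the proposal from the lexicographically first place.
    return min(live, key=lambda p: p.place)


Stage = Tuple[List[str], Callable[[int], int]]


def run_agent(stages: List[Stage], initial_state: int,
              crashed: Set[str]) -> Tuple[int, List[str]]:
    """Run the agent through its stages, one agreement instance per stage."""
    state = initial_state
    trace: List[str] = []
    for replicas, step in stages:
        # Crashed places contribute no proposal; live ones propose a result.
        proposals = {
            place: None if place in crashed else Proposal(place, step(state))
            for place in replicas
        }
        decision = decide(proposals)
        # Only the decided result is adopted, so the stage takes effect once.
        state = decision.state
        trace.append(decision.place)
    return state, trace


stages: List[Stage] = [
    (["a1", "a2", "a3"], lambda s: s + 1),
    (["b1", "b2", "b3"], lambda s: s * 10),
]
final_state, trace = run_agent(stages, 1, crashed={"a1"})
print(final_state, trace)
```

In this toy run the crash of `a1` does not block the agent: the first stage's decision falls to another replica and the agent moves on. A real system would have to reach the same decision over an asynchronous network, which is where an actual consensus algorithm comes in.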
