The concept of "stability" in asynchronous distributed decision-making systems
- 1 January 2000
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
- Vol. 30 (4) , 549-561
- https://doi.org/10.1109/3477.865172
Abstract
Asynchronous distributed decision-making (ADDM) systems constitute a special class of distributed problems and are characterized as large, complex systems wherein the principal elements are the geographically dispersed entities that communicate among themselves, asynchronously, through message passing and are permitted autonomy in local decision making. Such systems generally offer significant advantages over the traditional, centralized algorithms in the form of concurrency, scalability, high throughput, efficiency, low vulnerability to catastrophic failures, and robustness. A fundamental property of ADDM systems is stability, which refers to their behavior under representative perturbations to their operating environments, given that such systems are intended to be real, complex, and to some extent, mission-critical, and are subject to unexpected changes in their operating conditions. This paper introduces the concept of stability in ADDM systems and proposes an intuitive yet practical and usable definition that is inspired by those used in control systems and physics. An ADDM system is defined as a stable system if it returns to a steady state in finite time, following perturbation, provided that it is initiated in a steady state. Equilibrium or steady state is defined through placing bounds on the measured error in the system. Where the final steady state is equivalent to the initial one, a system is referred to as strongly stable. If the final steady state is potentially worse than the initial one, a system is deemed marginally stable. When a system fails to return to steady state following the perturbation, it is unstable. The perturbations are classified as either changes in the input pattern or changes in one or more environmental characteristics of the system, such as hardware failures. For a given ADDM system, the definitions are based on the performance indices that must be judiciously identified by the system architect and are likely to be unique.
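The three-way classification above can be illustrated with a minimal sketch. It is not the paper's method: it assumes a single scalar error measure sampled at discrete steps, a finite observation window standing in for "finite time", and two hypothetical bounds (`initial_bound` for the original steady state, `degraded_bound` for a worse but still acceptable one). The function names and thresholds are illustrative only; in a real ADDM system the performance indices would be chosen by the system architect.

```python
from typing import Optional, Sequence


def settle_index(errors: Sequence[float], bound: float, start: int) -> Optional[int]:
    """Return the first index >= start from which every remaining sample
    stays within `bound`, or None if the trace never settles.

    Assumption: the finite trace is the whole observation window, so
    "settles" means "stays within the bound until the window ends".
    """
    last_violation = start - 1
    for i in range(start, len(errors)):
        if errors[i] > bound:
            last_violation = i
    settle = last_violation + 1
    return settle if settle < len(errors) else None


def classify_stability(
    errors: Sequence[float],
    perturb_at: int,
    initial_bound: float,
    degraded_bound: float,
) -> str:
    """Classify an error trace as strongly stable, marginally stable, or unstable.

    Hypothetical parameters: `perturb_at` is the sample index at which the
    perturbation is applied; `degraded_bound` should be >= `initial_bound`.
    """
    # The definition presumes the system starts in a steady state.
    if any(e > initial_bound for e in errors[:perturb_at]):
        raise ValueError("system was not in steady state before the perturbation")

    # Returns to a steady state equivalent to the initial one.
    if settle_index(errors, initial_bound, perturb_at) is not None:
        return "strongly stable"
    # Returns to a steady state, but a worse one.
    if settle_index(errors, degraded_bound, perturb_at) is not None:
        return "marginally stable"
    # Never settles within the observation window.
    return "unstable"
```

For instance, a trace whose error spikes after the perturbation and then falls back under `initial_bound` would be labeled strongly stable, while one that levels off between the two bounds would be labeled marginally stable.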
To facilitate the understanding of stability in representative real-world systems, this paper reports the analysis of two basic manifestations of ADDM systems that have been reported in the literature: (1) a decentralized military command and control problem, MFAD, and (2) a novel distributed algorithm with soft reservation for efficient scheduling and congestion mitigation in railway networks, RYNSORD. Stability analysis of MFAD and RYNSORD yields key stable and unstable conditions. A system determined to be stable provides the reassurance that the system will perform well under adverse conditions. In contrast, a system deemed unstable reflects the need to address key weaknesses in the system design. Thus, stability analysis is a necessary and critical step in the development of any ADDM system.