Probabilistic Fault Localization in Communication Systems Using Belief Networks
- 18 October 2004
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE/ACM Transactions on Networking
- Vol. 12 (5), 809-822
- https://doi.org/10.1109/tnet.2004.836121
Abstract
We apply Bayesian reasoning techniques to perform fault localization in complex communication systems while using dynamic, ambiguous, uncertain, or incorrect information about the system structure and state. We introduce adaptations of two Bayesian reasoning techniques for polytrees, iterative belief updating, and iterative most probable explanation. We show that these approximate schemes can be applied to belief networks of arbitrary shape and overcome the inherent exponential complexity associated with exact Bayesian reasoning. We show through simulation that our approximate schemes are almost optimally accurate, can identify multiple simultaneous faults in an event driven manner, and incorporate both positive and negative information into the reasoning process. We show that fault localization through iterative belief updating is resilient to noise in the observed symptoms and prove that Bayesian reasoning can now be used in practice to provide effective fault localization.
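To make the idea of combining positive and negative information concrete, the following is a minimal sketch of exact Bayesian fault localization on a toy fault-to-symptom belief network. It is not the iterative belief updating or most probable explanation scheme from the article: it enumerates every fault combination, which is the exponential-cost exact reasoning that the approximate schemes are meant to avoid. The noisy-OR symptom model, the function name `posterior_fault_probs`, and all fault names, symptom names, and probabilities are assumptions chosen for illustration.

```python
from itertools import product


def posterior_fault_probs(priors, causal, observed):
    """Exact P(fault is active | observations) by brute-force enumeration.

    priors:   {fault: prior probability that the fault is active}
    causal:   {(fault, symptom): P(symptom raised | only this fault active)}
    observed: {symptom: True if observed, False if known to be absent}

    Assumes independent faults and a noisy-OR symptom model (illustrative
    assumptions, not taken from the article).
    """
    faults = sorted(priors)
    weights = {f: 0.0 for f in faults}
    total = 0.0
    # Visit all 2^n fault combinations: exponential in the number of faults.
    for state in product([False, True], repeat=len(faults)):
        active = {f for f, on in zip(faults, state) if on}
        # Prior probability of this combination.
        p = 1.0
        for f in faults:
            p *= priors[f] if f in active else 1.0 - priors[f]
        # Noisy-OR likelihood of each symptom's observed state.
        for symptom, seen in observed.items():
            p_absent = 1.0
            for f in active:
                p_absent *= 1.0 - causal.get((f, symptom), 0.0)
            p *= (1.0 - p_absent) if seen else p_absent
        total += p
        for f in active:
            weights[f] += p
    if total == 0.0:
        raise ValueError("observations are impossible under this model")
    return {f: weights[f] / total for f in faults}


# Hypothetical two-link example: s1 can be caused by either link,
# s2 only by link_b.
priors = {"link_a": 0.01, "link_b": 0.01}
causal = {("link_a", "s1"): 0.9, ("link_b", "s1"): 0.9, ("link_b", "s2"): 0.9}

# Positive information only: s1 was observed.
pos = posterior_fault_probs(priors, causal, {"s1": True})
# Positive and negative information: s1 observed, s2 known to be absent.
neg = posterior_fault_probs(priors, causal, {"s1": True, "s2": False})
```

With only the positive symptom `s1`, the two links are equally suspect. Adding the negative observation that `s2` is absent shifts the posterior toward `link_a`, because an active `link_b` would most likely have raised `s2` as well.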