End-to-end service failure diagnosis using belief networks

Abstract
We present fault localization techniques suitable for diagnosing end-to-end service problems in communication systems with complex topologies. We refine a layered system model that represents relationships between services and functions offered between neighboring protocol layers. In a given layer, an end-to-end service between two hosts may be provided using multiple host-to-host services offered in this layer between two hosts on the end-to-end path. Relationships among end-to-end and host-to-host services form a bipartite probabilistic dependency graph whose structure depends on the network topology in the corresponding protocol layer. When an end-to-end service fails or experiences performance problems, it is important to efficiently find the responsible host-to-host services. Finding the most probable explanation (MPE) of the observed symptoms is NP-hard. We propose two fault localization techniques based on Pearl's (1988) iterative algorithms for singly connected belief networks. The probabilistic dependency graph is transformed into a belief network; approximations based on Pearl's algorithms, together with an exact bucket-tree elimination algorithm, are then designed and evaluated through an extensive simulation study.
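
To make the MPE problem concrete, the following is a minimal sketch of a bipartite dependency model on a toy topology. It is not the paper's method: it enumerates every combination of host-to-host failures by brute force, which is exponential in the number of host-to-host services and only feasible for tiny examples. The link names, prior probabilities, the noisy-OR combination rule, and the constant CAUSE are all assumptions introduced for illustration.

```python
from itertools import product

# Hypothetical prior failure probabilities of host-to-host services.
prior = {"A-B": 0.01, "B-C": 0.05, "C-D": 0.02}

# Hypothetical end-to-end services and the host-to-host services on their paths.
# Together with `prior`, this is the bipartite dependency graph.
paths = {
    "A->C": ["A-B", "B-C"],
    "B->D": ["B-C", "C-D"],
    "A->B": ["A-B"],
}

# Assumed probability that one failed host-to-host service breaks an
# end-to-end service that uses it (noisy-OR combination, an assumption).
CAUSE = 0.9


def joint_probability(failed, observed):
    """P(this set of failed links AND the observed end-to-end states)."""
    p = 1.0
    for link, p_fail in prior.items():
        p *= p_fail if link in failed else 1.0 - p_fail
    for service, links in paths.items():
        # Noisy-OR: the service works only if no failed link breaks it.
        p_ok = 1.0
        for link in links:
            if link in failed:
                p_ok *= 1.0 - CAUSE
        p *= (1.0 - p_ok) if observed[service] else p_ok
    return p


def most_probable_explanation(observed):
    """Brute-force MPE: try all 2^n failure sets and keep the most probable."""
    links = list(prior)
    best, best_p = None, -1.0
    for states in product([False, True], repeat=len(links)):
        failed = frozenset(l for l, s in zip(links, states) if s)
        p = joint_probability(failed, observed)
        if p > best_p:
            best, best_p = failed, p
    return best, best_p


# True means the end-to-end service was observed to fail.
observed = {"A->C": True, "B->D": True, "A->B": False}
explanation, probability = most_probable_explanation(observed)
print(sorted(explanation), probability)
```

In this toy case the two failing end-to-end services share only B-C, and A->B is observed working, so the single failure of B-C is the most probable explanation. The exponential loop over failure sets is what makes approximate or structure-exploiting inference necessary on realistic topologies.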
