Modeling and Tracking of Transaction Flow Dynamics for Fault Detection in Complex Systems
- 13 November 2006
- journal article
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Dependable and Secure Computing
- Vol. 3 (4), 312-326
- https://doi.org/10.1109/tdsc.2006.52
Abstract
With the prevalence of Internet services and the increase of their complexity, there is a growing need to improve their operational reliability and availability. While a large amount of monitoring data can be collected from systems for fault analysis, it is hard to correlate this data effectively across distributed systems and observation time. In this paper, we analyze the mass characteristics of user requests and propose a novel approach to model and track transaction flow dynamics for fault detection in complex information systems. We measure the flow intensity at multiple checkpoints inside the system and apply system identification methods to model transaction flow dynamics between these measurements. With the learned analytical models, a model-based fault detection and isolation method is applied to track the flow dynamics in real time for fault detection. We also propose an algorithm to automatically search and validate the dynamic relationship between randomly selected monitoring points. Our algorithm enables systems to have self-cognition capability for system management. Our approach is tested in a real system with a list of injected faults. Experimental results demonstrate the effectiveness of our approach and algorithms.
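The idea of modeling flow dynamics between two checkpoints and tracking residuals can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract says only "system identification methods", so the first-order ARX form y[t] = a*y[t-1] + b*x[t-1], the synthetic flow data, the threshold rule (three times the largest training residual), and all function names below are assumptions made for illustration.

```python
import random


def fit_arx(x, y):
    """Least-squares fit of y[t] = a*y[t-1] + b*x[t-1] (assumed model form)."""
    syy = sxx = sxy = ry = rx = 0.0
    for t in range(1, len(y)):
        py, px = y[t - 1], x[t - 1]
        syy += py * py
        sxx += px * px
        sxy += py * px
        ry += py * y[t]
        rx += px * y[t]
    # Solve the 2x2 normal equations with Cramer's rule.
    det = syy * sxx - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("regressors are collinear; model cannot be identified")
    a = (ry * sxx - sxy * rx) / det
    b = (syy * rx - sxy * ry) / det
    return a, b


def residuals(x, y, a, b):
    """One-step-ahead prediction errors; entry i corresponds to time t = i + 1."""
    return [y[t] - (a * y[t - 1] + b * x[t - 1]) for t in range(1, len(y))]


def simulate(x, a, b, rng, noise=0.5, fault_at=None, b_fault=None):
    """Synthetic output-checkpoint flow; after fault_at the gain changes to b_fault."""
    y = [b * x[0] / (1.0 - a)]  # start near the steady-state level
    for t in range(1, len(x)):
        gain = b_fault if fault_at is not None and t >= fault_at else b
        y.append(a * y[t - 1] + gain * x[t - 1] + rng.uniform(-noise, noise))
    return y


rng = random.Random(0)
FAULT_AT = 60  # hypothetical time step at which a fault is injected

# Learn the model from fault-free flow intensities at two checkpoints.
x_train = [rng.uniform(80, 120) for _ in range(300)]
y_train = simulate(x_train, 0.5, 0.4, rng)
a_hat, b_hat = fit_arx(x_train, y_train)

# Threshold rule (an assumption): three times the largest training residual.
threshold = 3.0 * max(abs(r) for r in residuals(x_train, y_train, a_hat, b_hat))

# Track new data in which the downstream checkpoint starts losing requests.
x_test = [rng.uniform(80, 120) for _ in range(120)]
y_test = simulate(x_test, 0.5, 0.4, rng, fault_at=FAULT_AT, b_fault=0.1)
alarms = [i + 1 for i, r in enumerate(residuals(x_test, y_test, a_hat, b_hat))
          if abs(r) > threshold]
```

Here `alarms` holds the time steps whose residual exceeds the threshold. Because the prediction uses the measured previous output, the residual stays large for as long as the relationship between the two checkpoints remains broken, rather than only at the moment the fault begins.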