Semi-automated discovery of application session structure
- 25 October 2006
- proceedings article
- Published by Association for Computing Machinery (ACM)
- p. 119-132
- https://doi.org/10.1145/1177080.1177096
Abstract
While the problem of analyzing network traffic at the granularity of individual connections has seen considerable previous work and tool development, understanding traffic at a higher level - the structure of user-initiated sessions comprised of groups of related connections - remains much less explored. Some types of session structure, such as the coupling between an FTP control connection and the data connections it spawns, have prespecified forms, though the specifications do not guarantee how the forms appear in practice. Other types of sessions, such as a user reading email with a browser, only manifest empirically. Still other sessions might exist without us even knowing of their presence, such as a botnet zombie receiving instructions from its master and proceeding in turn to carry them out. We present algorithms rooted in the statistics of Poisson processes that can mine a large corpus of network connection logs to extract the apparent structure of application sessions embedded in the connections. Our methods are semi-automated in that we aim to present an analyst with high-quality information (expressed as regular expressions) reflecting different possible abstractions of an application's session structure. We develop and test our methods using traces from a large Internet site, finding diversity in the number of applications that manifest, their different session structures, and the presence of abnormal behavior. Our work has applications to traffic characterization and monitoring, source models for synthesizing network traffic, and anomaly detection.
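The abstract does not spell out the algorithms, so the following is only an illustrative sketch of the general idea of using Poisson statistics to decide whether connections belong together; it should not be read as the paper's method. It assumes connection arrivals that are unrelated would follow a Poisson process with a known rate, and it groups a connection with its predecessor when so short a gap would be unlikely by chance. The function names, the rate, and the threshold `alpha` are all hypothetical choices made for this example.

```python
import math


def chance_probability(rate_per_sec: float, gap_sec: float) -> float:
    """Probability that a Poisson process with the given rate produces
    at least one arrival within gap_sec seconds."""
    return 1.0 - math.exp(-rate_per_sec * gap_sec)


def group_sessions(arrivals, rate_per_sec, alpha=0.05):
    """Greedily group (timestamp, label) connection records into sessions.

    A connection joins the current session when the gap since the previous
    connection would occur with probability below alpha under independent
    Poisson arrivals (an assumed null model); otherwise it starts a new one.
    """
    sessions = []
    for t, label in sorted(arrivals):
        if sessions:
            previous_time = sessions[-1][-1][0]
            if chance_probability(rate_per_sec, t - previous_time) < alpha:
                sessions[-1].append((t, label))
                continue
        sessions.append([(t, label)])
    return sessions


# Made-up connection log: (arrival time in seconds, application label).
arrivals = [
    (0.0, "ftp"),
    (1.5, "ftp-data"),
    (400.0, "http"),
    (401.0, "http"),
    (2000.0, "ssh"),
]

# Assumed background rate of one unrelated connection per 100 seconds.
for session in group_sessions(arrivals, rate_per_sec=0.01):
    print([label for _, label in session])
```

With these assumed values, gaps shorter than roughly five seconds are treated as evidence of a shared session. A real analysis would have to estimate the rate from the logs and would likely need a more careful test than this single-threshold heuristic.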