Dynamic monitoring of high-performance distributed applications
- 25 June 2003
- proceedings article
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Developers and users of high-performance distributed systems often observe performance problems such as unexpectedly low throughput or high latency. Determining the source of the performance problems requires detailed end-to-end instrumentation of all components, including the applications, operating systems, hosts, and networks. However, one must be very careful to design the instrumentation to have extremely low overhead, and not affect the system being monitored. In this paper we present a very light-weight instrumentation system that can be dynamically activated to unobtrusively collect and aggregate detailed end-to-end monitoring information from distributed applications. We also show how emerging "Web Services" can be used to facilitate remote interaction with this system.