Monitoring program behaviour on SUPRENUM
- 1 April 1992
- journal article
- Published by Association for Computing Machinery (ACM) in ACM SIGARCH Computer Architecture News
- Vol. 20 (2), 332-341
- https://doi.org/10.1145/146628.140394
Abstract
It is often very difficult for programmers of parallel computers to understand how their parallel programs behave at execution time, because there is not enough insight into the interactions between concurrent activities in the parallel machine. Programmers do not only wish to obtain statistical information that can be supplied by profiling, for example. They need to have detailed knowledge about the functional behaviour of their programs. Considering performance aspects, they need timing information as well. Monitoring is a technique well suited to obtain information about both functional behaviour and timing. Global time information is essential for determining the chronological order of events on different nodes of a multiprocessor or of a distributed system, and for determining the duration of time intervals between events from different nodes. A major problem on multiprocessors is the absence of a global clock with high resolution. This problem can be overcome if a monitor system capable of supplying globally valid time stamps is used. In this paper, the behaviour and performance of a parallel program on the SUPRENUM multiprocessor is studied. The method used for gaining insight into the runtime behaviour of a parallel program is hybrid monitoring, a technique that combines advantages of both software monitoring and hardware monitoring. A novel interface makes it possible to measure program activities on SUPRENUM. The SUPRENUM system and the ZM4 hardware monitor are briefly described. The example program under study is a parallel ray tracer. We show that hybrid monitoring is an excellent method to provide programmers with valuable information for debugging and tuning of parallel programs.