Abstract
Observing the activities of a complex parallel computer system is no small feat, and relating these observations to program behavior is even harder. This paper presents a general measurement approach that is applicable to a large class of scalable programs and machines, specifically data parallel programs executing on distributed memory computer systems. The combined instrumentation and visualization paradigm, called VISTA (which stands for Visualization and Instrumentation of Scalable mulTicomputer Applications), is based on the author's experiences of programming and monitoring applications running on an nCUBE 2 computer and a MasPar MP-1 computer. The key is that performance data are treated similarly to any distributed data in the context of the data parallel programming model. Because of the data-parallel mapping of the program onto the machine, one can view the performance as it relates to each processor, processor cluster or processor ensemble and as it relates to the data structures of the program. The author illustrates the utility of VISTA by way of an example.
