The impact of MPI queue usage on message latency

Abstract
It is well known that traditional microbenchmarks do not fully capture the salient architectural features that impact application performance. Even worse, microbenchmarks that target MPI and the communications subsystem do not accurately represent the way that applications use MPI. For example, traditional MPI latency benchmarks time a ping-pong communication with one send and one receive on each of two nodes. The time to post the receive is never counted as part of the latency. This scenario is not even marginally representative of most applications. Two new microbenchmarks are presented here that analyze network latency in a way that more realistically represents the way that MPI is typically used. These benchmarks are used to evaluate modern high-performance networks, including Quadrics, InfiniBand, and Myrinet.

This publication has 11 references indexed in Scilit: