An Adaptive Synchronization Technique for Parallel Simulation of Networked Clusters

Abstract
Computer clusters are a very cost-effective approach for high performance computing, but simulating a complete cluster is still an open research problem. The obvious approach—to parallelize individual node simulators—is complex and slow. Combining individual parallel simulators implies synchronizing their progress of time. This can be accomplished with a variety of parallel discrete event simulation techniques, but unfortunately any straightforward approach introduces a synchronization overhead causing up two orders of magnitude of slowdown with respect to the simulation speed of an individual node. In this paper we present a novel adaptive technique that automatically adjusts the synchronization boundaries. By dynamically relaxing accuracy over the least interesting computational phases we dramatically increase performance with a marginal loss of precision. For example, in the simulation of an 8-node cluster running NAMD (a parallel molecular dynamics application) we show an acceleration factor of 26x over the deterministic "ground truth" simulation, at less than a 1% accuracy error.

This publication has 7 references indexed in Scilit: