Simple Parallel Statistical Computing in R

Abstract
Theoretically, many modern statistical procedures are trivial to parallelize. However, practical deployment of a parallelized implementation which is robust and reliably runs on different computational cluster configurations and environments is far from trivial. We present a framework for the R statistical computing language that provides a simple yet powerful programming interface to a computational cluster of CPUs. This interface allows the rapid development of R functions that distribute independent computations across the nodes of the computational cluster. The approach can be extended to finer grain parallelization if needed. The resulting framework allows statisticians to obtain significant speed-ups for some computations at little additional development cost. The particular implementation can be deployed in ad-hoc heterogeneous computing environments.

This publication has 9 references indexed in Scilit: