Parallel Run Length Encoding Compression: Reducing I/O in Dynamic Environmental Simulations

Abstract
Dynamic simulations based on time-varying inputs are extremely I/O intensive. This is shown by industrial applications generating environmental projections based on seasonal-to-interannual climate forecasts, which have a compute-to-data-access ratio of O(n), leading to significant performance degradation. Exploiting compression techniques such as run length encoding (RLE) significantly reduces the I/O bottleneck and storage requirements. Unfortunately, traditional RLE algorithms do not perform well on a parallel vector platform such as the Cray architecture. This paper describes the design and implementation of a new RLE algorithm based on data chunking and packing that exploits the Cray gather-scatter vector hardware and multiple processors. This approach reduces I/O and file storage requirements on average by an order of magnitude. Data-intensive applications such as the integration of environmental and global climate models now become practical in a realistic time frame.
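The general idea of combining RLE with data chunking can be illustrated with a minimal Python sketch. This is not the paper's algorithm: it does not model the packing step or Cray gather-scatter hardware, and the function names, the `chunk_size` parameter, and the use of a thread pool are assumptions made purely for illustration. It only shows that once the data is split into fixed-size chunks, each chunk can be encoded independently, which is what makes the work distributable across processors.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby


def rle_encode(values):
    """Encode a sequence as a list of (value, run_length) pairs."""
    return [(v, sum(1 for _ in run)) for v, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into a flat list."""
    out = []
    for v, n in pairs:
        out.extend([v] * n)
    return out


def encode_chunked(values, chunk_size, workers=4):
    """Split the input into fixed-size chunks and encode each one independently.

    Hypothetical helper: chunks share no state, so they can be handed to
    separate workers. A run that crosses a chunk boundary is split in two.
    """
    chunks = [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves chunk order, so decoding can simply concatenate.
        return list(pool.map(rle_encode, chunks))


def decode_chunked(encoded_chunks):
    """Decode each chunk and concatenate the results in order."""
    out = []
    for pairs in encoded_chunks:
        out.extend(rle_decode(pairs))
    return out
```

For example, `encode_chunked([0, 0, 0, 0, 5, 5, 5, 1, 1, 1, 1, 1], chunk_size=4)` yields one list of pairs per chunk, and `decode_chunked` restores the original sequence. The trade-off in this sketch is that runs spanning a chunk boundary are stored as two shorter runs, which costs a little compression in exchange for independent chunks.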
