A data management approach for handling large compressed arrays in high performance computing
- 19 November 2002
- proceedings article
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 119-128
- https://doi.org/10.1109/fmpc.1995.380456
Abstract
Poor parallel I/O performance has recently been recognized as a roadblock to scalability of parallel architectures, algorithms, and data sets. For I/O of large arrays, the storage of arrays by subarray divisions (chunking) has been shown to improve I/O performance substantially in many circumstances. In this paper we show how to increase the performance advantages of chunking by combining it with data compression, and describe the results of experiments with compressed chunks from scientific data sets on the Intel iPSC/860. For a particular fixed array size and compression ratio, uncompressed chunk I/O is faster than compressed chunk I/O when the number of processors is small; the reverse holds when the number of processors is large, as the cost of compression is spread over a larger number of processors. With good compression ratios and large numbers of processors, we obtained an effective logical I/O rate for compressed chunks that exceeds the theoretically possible maximum for uncompressed data, by adding compression to an existing chunked I/O library. Our results suggest that compression may be a good technique for handling sparse arrays in parallel I/O.
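The crossover described in the abstract can be illustrated with a toy cost model. This is a minimal sketch, not the model used in the paper: it assumes a fixed aggregate I/O bandwidth that does not grow with the processor count, compression work that divides evenly across processors, and no overlap between compression and transfer. All function names, parameters, and numbers below are hypothetical and chosen only for illustration.

```python
def uncompressed_time(size_mb, io_bw):
    """Seconds to write size_mb at a fixed aggregate I/O bandwidth (MB/s)."""
    return size_mb / io_bw


def compressed_time(size_mb, io_bw, ratio, comp_rate, procs):
    """Seconds to compress in parallel, then write the smaller result.

    ratio     -- compression ratio (original size / compressed size), assumed > 1
    comp_rate -- per-processor compression throughput in MB/s (assumed)
    procs     -- number of processors sharing the compression work
    """
    compress = size_mb / (comp_rate * procs)   # cost spread over processors
    transfer = (size_mb / ratio) / io_bw       # fewer bytes reach the disks
    return compress + transfer


def effective_rate(size_mb, seconds):
    """Logical (uncompressed) MB moved per second."""
    return size_mb / seconds


def crossover_procs(io_bw, ratio, comp_rate):
    """Processor count above which compressed I/O wins under this model."""
    return io_bw / (comp_rate * (1.0 - 1.0 / ratio))


if __name__ == "__main__":
    # Hypothetical figures: 100 MB array, 10 MB/s disks, 4:1 compression,
    # 1 MB/s compression throughput per processor.
    size, bw, ratio, rate = 100.0, 10.0, 4.0, 1.0
    print("uncompressed:", uncompressed_time(size, bw), "s")
    for p in (2, 8, 16, 64):
        t = compressed_time(size, bw, ratio, rate, p)
        print(p, "procs:", round(t, 2), "s,",
              round(effective_rate(size, t), 2), "MB/s logical")
    print("crossover near", round(crossover_procs(bw, ratio, rate), 2), "procs")
```

Under these assumptions the compression term shrinks as processors are added while the transfer term stays constant, so beyond the crossover point the logical rate can exceed the raw I/O bandwidth, which is the qualitative behavior the abstract reports.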