Efficient structured data access in parallel file systems
- 1 January 2003
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 29, 326-335
- https://doi.org/10.1109/clustr.2003.1253331
Abstract
Parallel scientific applications store and retrieve very large, structured datasets. Directly supporting these structured accesses is an important step in providing high-performance I/O solutions for these applications. High-level interfaces such as HDF5 and Parallel netCDF provide convenient APIs for accessing structured datasets, and the MPI-IO interface also supports efficient access to structured data. However, parallel file systems do not traditionally support such access. In this work we present an implementation of structured data access support in the context of the parallel virtual file system (PVFS). We call this support "datatype I/O" because of its similarity to MPI datatypes. This support is built by using a reusable datatype-processing component from the MPICH2 MPI implementation. We describe how this component is leveraged to efficiently process structured data representations resulting from MPI-IO operations. We quantitatively assess the solution using three test applications. We also point to further optimizations in the processing path that could be leveraged for even more efficient operation.
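To illustrate what a structured access description buys over a plain list of byte ranges, the following is a minimal sketch, not the PVFS or MPICH2 implementation. The `Vector` class and the `flatten` and `coalesce` helpers are hypothetical names introduced only for this example. A strided description stays three integers regardless of how many blocks it covers, whereas the equivalent offset-length list grows with the number of blocks.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple

Region = Tuple[int, int]  # (byte offset, length in bytes)


@dataclass(frozen=True)
class Vector:
    """Hypothetical strided description, loosely modeled on an MPI vector type.

    count    -- number of blocks
    blocklen -- bytes per block
    stride   -- distance in bytes between the starts of consecutive blocks
    """
    count: int
    blocklen: int
    stride: int


def flatten(v: Vector, base: int = 0) -> Iterator[Region]:
    """Expand the compact description into explicit (offset, length) regions."""
    for i in range(v.count):
        yield (base + i * v.stride, v.blocklen)


def coalesce(regions: Iterable[Region]) -> List[Region]:
    """Merge regions that are adjacent in the file into single larger regions."""
    merged: List[Region] = []
    for offset, length in regions:
        if merged and merged[-1][0] + merged[-1][1] == offset:
            last_offset, last_length = merged[-1]
            merged[-1] = (last_offset, last_length + length)
        else:
            merged.append((offset, length))
    return merged


# One 8-byte column out of rows that are 32 bytes wide, for 4 rows.
column = Vector(count=4, blocklen=8, stride=32)
regions = list(flatten(column))

# When the stride equals the block length the access is contiguous.
contiguous = coalesce(flatten(Vector(count=3, blocklen=8, stride=8)))
```

In this sketch `regions` holds four separate entries, while `contiguous` collapses to a single region. Passing the compact description along the I/O path, rather than the expanded list, is the general idea the abstract refers to as datatype I/O.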
This publication has 9 references indexed in Scilit:
- Parallel netCDF. Published by Association for Computing Machinery (ACM), 2003
- Noncontiguous I/O through PVFS. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- Noncontiguous I/O accesses through MPI-IO. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- FLASH: An Adaptive Mesh Hydrodynamics Code for Modeling Astrophysical Thermonuclear Flashes. The Astrophysical Journal Supplement Series, 2000
- The Implementation of MPI-2 One-Sided Communication for the NEC SX-5. Published by Institute of Electrical and Electronics Engineers (IEEE), 2000
- On implementing MPI-IO portably and with high performance. Published by Association for Computing Machinery (ACM), 1999
- Data sieving and collective I/O in ROMIO. Published by Institute of Electrical and Electronics Engineers (IEEE), 1999
- An Extended Two-Phase Method for Accessing Sections of Out-of-Core Arrays. Scientific Programming, 1996
- Implementation and performance of a parallel file system for high performance distributed applications. Published by Institute of Electrical and Electronics Engineers (IEEE), 1996