Evaluating I/O Characteristics and Methods for Storing Structured Scientific Data

Avery Ching 1, Alok Choudhary 1, Wei-keng Liao 1, Lee Ward 2, and Neil Pundit 2

1 Northwestern University, Department of EECS, Evanston, IL 60208-3118 USA
{aching, choudhar, wkliao}@ece.northwestern.edu

2 Sandia National Laboratories, Scalable Computer Systems Department, Albuquerque, NM 87185-1110 USA
{lee, pundit}@sandia.gov

Abstract

Many large-scale scientific simulations generate large, structured multi-dimensional datasets. Data is stored at various intervals on high-performance I/O storage systems for checkpointing, post-processing, and visualization. Data storage is very I/O intensive and, depending on the characteristics of the I/O access pattern, can dominate the overall running time of an application. Our NCIO benchmark determines how strongly I/O characteristics affect performance (by up to two orders of magnitude) and provides scientific application developers with guidelines for improvement. In this paper, we examine the impact of various I/O parameters and methods when using the MPI-IO interface to store structured scientific data in an optimized parallel file system.

1. Introduction

There is a class of scientific simulations that compute on large, structured multi-dimensional datasets which must be stored at numerous time steps. Data storage is necessary for visualization, snapshots, checkpointing, out-of-core computation, post-processing [11], and numerous other reasons. Integrated Parallel Accurate Reservoir Simulation (IPARS) [12] is one such example. IPARS, a software system for large-scale oil reservoir simulation, computes at every time step on a three-dimensional data grid of 9,000 cells. Each cell in the grid holds 17 variables. A total of 10,000 time steps generates approximately 6.9 GBytes of data. Another example of scientific computing on large structured datasets is the Advanced Simulation and Computing (ASC) FLASH code.
The ASC FLASH code [7], an adaptive mesh refinement application that solves fully compressible, reactive hydrodynamics equations for studying nuclear flashes on neutron stars and white dwarfs, stores 24 variables per cell in a three-dimensional data grid. Storing time step data in structured data grids for applications like IPARS and ASC FLASH requires an I/O access pattern that contains both a memory description and a matching file description. When individual variables are computed and stored, the memory and file descriptions generated for the resulting I/O access pattern may have contiguous regions as small as a double (usually 8 bytes). Numerous studies have shown that the noncontiguous I/O access patterns evident in applications such as IPARS and FLASH are common to most scientific applications [1, 6]. Most scientific applications use MPI-IO natively or through higher-level I/O libraries such as pNetCDF [9] or HDF5 [8].

Cluster computing has rapidly become the leading hardware platform for large-scale scientific simulation due to its cost-effectiveness and scalability. While I/O has traditionally been a bottleneck in PCs, servicing the I/O requirements of an entire cluster amplifies this problem to a much larger scale. Parallel file systems, which are commonly used in most large-scale clusters, help clusters attain better I/O performance by striping data across multiple disks [10, 5, 13, 2, 16].

In this paper, we generalize noncontiguous I/O access for storing scientific data in a modern parallel file system and evaluate the effects of varying three I/O characteristics: region count, region size, and region spacing. We created the noncontiguous I/O benchmark, NCIO, to help application designers optimize their I/O algorithms. NCIO tests various I/O methods (POSIX I/O, list I/O, two phase I/O, and datatype I/O).

1-4244-0054-6/06/$20.00 ©2006 IEEE