Pattern-Direct and Layout-Aware Replication Scheme for Parallel I/O Systems
Yanlong Yin∗, Jibing Li∗, Jun He∗, Xian-He Sun∗, and Rajeev Thakur†
∗Computer Science Department, Illinois Institute of Technology, Chicago, Illinois 60616
Email: {yyin2, jli33, jhe24, sun}@iit.edu
†Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Illinois 60439
Email: thakur@mcs.anl.gov
Abstract—The performance gap between computing power and the I/O system is ever increasing, and meanwhile more and more High Performance Computing (HPC) applications are becoming data intensive. This study describes an I/O data replication scheme, named the Pattern-Direct and Layout-Aware (PDLA) data replication scheme, to alleviate this performance gap. The basic idea of PDLA is to replicate the data accessed in each identified access pattern and to save these reorganized replicas with optimized data layouts based on access cost analysis. A runtime system is designed and developed to integrate the PDLA replication scheme with existing parallel I/O systems; a prototype of PDLA is implemented in the MPICH2 and PVFS2 environments. Experimental results show that PDLA is effective in improving the data access performance of parallel I/O systems.
Keywords-Parallel I/O; I/O optimization; data replication;
data reorganization; data access pattern
I. INTRODUCTION
During the last several decades, the rapid development of semiconductor technology has allowed processor speed to increase exponentially, and supercomputers are moving from petascale toward exascale in the coming decade. However, the development of data input/output (I/O) systems and storage devices has not kept pace with that of computing power. As many believe, this biased technology advance will continue in the near future. The resulting imbalance leads to the so-called I/O-wall problem.
In the meantime, large-scale scientific applications grow continuously in data access intensity, imposing a greater workload on the I/O and storage subsystems. This trend puts even more pressure on already saturated I/O systems. For instance, in astronomy, giant radio telescopes capture observation images continuously, and the captured data are saved into storage systems. Data analysis applications, such as Montage [1] developed by NASA, then read the data out of the storage systems and analyze them. The telescopes may generate data at rates of many gigabytes to even petabytes per second, and the data analysis is both computationally intensive and data intensive [2].
Relatively slow storage devices, compounded with data-intensive applications, make the I/O system the primary performance bottleneck in many HPC systems. This drawback motivates this study, which aims to alleviate the I/O bottleneck, especially for data-intensive applications.
I/O performance has been a hot research topic in recent years. Many I/O optimization techniques have been developed, such as data sieving [3], List I/O [4], DataType I/O [5], and Collective I/O [3] [6].
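To illustrate one of these techniques: data sieving [3] services many small noncontiguous requests by issuing a single large contiguous read that covers them all and then extracting the requested pieces in memory. The sketch below is a minimal illustration of that idea, not the paper's or ROMIO's actual implementation; the request list and in-memory "file" are made-up examples.

```python
# Minimal sketch of the data sieving idea: read one contiguous span
# covering all noncontiguous requests, then extract the pieces in memory.
# The request list and the in-memory "file" below are hypothetical.
import io

def data_sieving_read(f, requests):
    """requests: list of (offset, length) tuples, assumed sorted by offset.
    Returns (pieces, bytes_actually_read)."""
    start = requests[0][0]
    end = max(off + ln for off, ln in requests)
    f.seek(start)
    buf = f.read(end - start)  # one large contiguous read instead of many small ones
    pieces = [buf[off - start: off - start + ln] for off, ln in requests]
    return pieces, len(buf)

if __name__ == "__main__":
    f = io.BytesIO(bytes(range(100)))      # stand-in for a real file
    reqs = [(10, 4), (30, 4), (50, 4)]     # three small noncontiguous requests
    pieces, nread = data_sieving_read(f, reqs)
    print(len(pieces), nread)              # 3 pieces served by one 44-byte read
```

The trade-off, as in real data sieving, is that extra bytes (the "holes" between requests) are read and discarded in exchange for far fewer I/O requests.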
Some systems also integrate new layers or middleware into the parallel I/O software stack. All these layers and optimization techniques make the parallel I/O system exceedingly complex. How to optimize I/O performance remains elusive, and the optimization is a complex, error-prone, and time-consuming task, especially for applications with complex I/O behaviors. For example, Zhang's work [7] shows that Collective I/O works well in some cases but not in others. Song's work [8] shows that finding the optimal data layout configuration in PVFS2 can be a daunting task. Their work further confirms our belief that I/O performance is application dependent and that a general I/O system needs to be adjustable to different applications [9]. This raises a must-have property of our solution: the I/O optimization should take the application's and the system's characteristics into consideration and adapt to different applications.
To achieve the goal of alleviating the I/O bottleneck and to satisfy this adaptability requirement, we design and implement the Pattern-Direct and Layout-Aware (PDLA) replication scheme for parallel I/O systems. We design PDLA based on the following facts.
1) Contiguous data access is preferable.
The performance of contiguous data access is higher than that of noncontiguous data access. This holds true for both hard disk drives (HDD) and solid state disks (SSD) [10].
2) Data layout matters.
The data layout in parallel file systems can largely influence I/O performance. Modern parallel file systems support multiple data layout policies: users can choose to distribute data on one single storage node, on a subset of nodes, or on all available nodes. Previous work [8] shows that, for applications with different data access patterns, the optimal data layouts are different. The optimal data layout yields the
[2013 IEEE 27th International Symposium on Parallel & Distributed Processing. 1530-2075/13 $26.00 © 2013 IEEE. DOI 10.1109/IPDPS.2013.114]