Improving MPI-I/O Performance on PVFS ⋆

Jonathan Ilroy 2, Cyrille Randriamaro 1, and Gil Utard 1

1 LaRIA, Université de Picardie Jules Verne, 80000 Amiens, France
2 Université de Mons-Hainaut, 7000 Mons, Belgium

Abstract. Since the definition of MPI-IO, a standard interface for parallel I/O, several implementations have become available for clusters of workstations. In this paper we focus on the ROMIO implementation (from Argonne National Laboratory) running on PVFS. PVFS [5] is a Parallel Virtual File System developed at Clemson University. This file system uses the local file systems of the I/O nodes in a cluster to store data on disks; data is striped across the disks according to a stripe parameter. The ROMIO implementation is not aware of this particular data distribution. We show how to improve the performance of MPI-IO collective I/O on such a parallel and distributed file system: the optimization avoids the data redistribution induced by the PVFS file system. We present performance results for typical file access patterns found in data-parallel applications and compare them with the performance of the original PVFS port.

1 Introduction

Many scientific and engineering problems involve data sets that are too large to fit in main memory. Such applications are referred to as "parallel out-of-core" applications. Solving parallel out-of-core problems requires a high-performance parallel I/O library. The MPI Forum has defined a standard application programming interface for parallel I/O called MPI-IO. MPI-IO is part of the MPI-2 specification and should be widely used by the community. There are two main implementations of MPI-IO: PMPIO from NAS and ROMIO [6] from Argonne National Laboratory. These implementations are available for several file systems, such as the Intel PFS or the IBM PIOFS. New parallel file systems have also appeared for commodity clusters, such as BPFS [1], PPFS, ViPIOS, GFS, and PVFS [5].
Such file systems distribute files across several disks in the cluster and provide better I/O bandwidth and throughput than a conventional UFS or NFS. ROMIO runs on top of the PVFS file system. In [4], we presented preliminary performance results of ROMIO on PVFS. In this paper we show how to improve the performance of MPI-IO collective I/O on such a parallel and distributed file system: the optimization avoids the data redistribution induced by the PVFS file system. We show new performance results for a typical file access pattern found in data-parallel applications and compare them with the performance of the original PVFS port.

⋆ This work is supported by a grant from "Pôle de Modélisation de la Région Picardie".

R. Sakellariou et al. (Eds.): Euro-Par 2001, LNCS 2150, pp. 911–915, 2001.
© Springer-Verlag Berlin Heidelberg 2001