Improving MPI-I/O Performance on PVFS ⋆

Jonathan Ilroy 2, Cyrille Randriamaro 1, and Gil Utard 1

1 LaRIA, Université de Picardie Jules Verne, 80000 Amiens, France
2 Université de Mons-Hainaut, 7000 Mons, Belgium

Abstract. Since the definition of MPI-IO, a standard interface for parallel I/O, several implementations have become available for clusters of workstations. In this paper we focus on the ROMIO implementation (from Argonne National Laboratory) running on PVFS. PVFS [5] is a Parallel Virtual File System developed at Clemson University. This file system uses the local file systems of the I/O nodes in a cluster to store data on disks; data is striped across the disks according to a stripe parameter. The ROMIO implementation is not aware of this particular data distribution. We show how to improve the performance of MPI-IO collective I/O on such a parallel and distributed file system: the optimization avoids the data redistribution induced by the PVFS file system. We present performance results for typical file access patterns found in data-parallel applications and compare them with the performance of the original PVFS port.

1 Introduction

Many scientific and engineering problems involve data sets that are too large to fit in main memory. Such applications are referred to as "parallel out-of-core" applications. Solving parallel out-of-core problems requires a high-performance parallel I/O library. The MPI Forum has defined a standard application programming interface for parallel I/O called MPI-IO. MPI-IO is part of the MPI-2 specification and should be widely used by the community. There are two main implementations of MPI-IO: PMPIO from NAS and ROMIO [6] from Argonne National Laboratory. These implementations are available for several file systems, such as the Intel PFS or the IBM PIOFS. New parallel file systems have also appeared for commodity clusters, such as BPFS [1], PPFS, ViPIOS, GFS, and PVFS [5].
Such file systems distribute files across several disks in the cluster and provide better I/O bandwidth and throughput than a conventional UFS or NFS. ROMIO runs on top of the PVFS file system. In [4], we presented preliminary performance results of ROMIO on PVFS. In this paper we show how to improve the performance of MPI-IO collective I/O on such a parallel and distributed file system: the optimization avoids the data redistribution induced by the PVFS file system. We show new performance results for a typical file access pattern found in data-parallel applications and compare them with the performance of the original PVFS port.

⋆ This work is supported by a grant from "Pôle de Modélisation de la Région Picardie".

R. Sakellariou et al. (Eds.): Euro-Par 2001, LNCS 2150, pp. 911–915, 2001.
© Springer-Verlag Berlin Heidelberg 2001