Performance Evaluation of I/O Traffic and Placement of I/O Nodes on a High Performance Network

Salvador Coll, Fabrizio Petrini, Eitan Frachtenberg and Adolfy Hoisie

CCS-3 Modeling, Algorithms, & Informatics
Computer & Computational Sciences Division
Los Alamos National Laboratory

Electronic Engineering Department
Technical University of Valencia

{scoll,fabrizio,eitanf,hoisie}@lanl.gov

Abstract

A common trend in the design of large-scale clusters is to use a high-performance data network to integrate the processing nodes into a single parallel computer. In these systems the performance of the interconnect can be a limiting factor for input/output (I/O), which is traditionally bottlenecked by disk bandwidth. In this paper we present an experimental analysis, performed on a 64-node AlphaServer cluster based on the Quadrics network (QsNET), of the behavior of the interconnect under I/O traffic and of the influence of the placement of the I/O servers on overall performance. The effects of using dedicated I/O nodes, or of overlapping I/O and computation on the I/O nodes, are also analyzed. In addition, we evaluate how background I/O traffic interferes with other parallel applications running concurrently. Our experimental results show that a correct placement of the I/O servers can provide up to a 20% increase in the available I/O bandwidth. Moreover, some important guidelines for mapping applications and I/O servers on large-scale clusters are given.

Keywords: Interconnection Networks, Performance Evaluation, User-level Communication, Input/Output.

1 Introduction

Scientific applications that run on parallel systems usually require the input and output of large amounts of data, so I/O performance can be a potential bottleneck. Collective I/O, in which all processes cooperate to carry out large-scale I/O transactions, has been proposed as a way to improve the I/O performance of such applications.
Some techniques currently used to provide collective I/O facilities are: (1) parallel file systems (HFS for the HP Exemplar [2], PFS for the Intel Paragon [9], PIOFS and GPFS for the IBM SP [5], XFS for the SGI Origin2000 [20], PVFS [3] for Linux clusters), (2) distributed file systems (NFS [21], GFS [17]) and (3) run-time I/O libraries (MPI-IO [8]). Most of these systems assume that the I/O subsystem is homogeneous and that message passing over the network is fast and scalable. Although I/O performance on massively parallel processors has traditionally been limited by disk bandwidth [4], the behavior of the interconnect in such systems can also be a performance-limiting factor. The efficient integration of the interconnection network with the I/O subsystem is therefore key to exploiting the power of high-performance parallel computers. InfiniBand [1] is an emerging standard that provides an integrated view of computing, networking and storage technologies. The InfiniBand architecture is based on a switched interconnect technology with high-speed point-to-point links; it offers support for Quality of Service (QoS), fault tolerance, remote direct memory access, and more, and is likely to become the backbone of future commodity parallel computers, I/O servers, and data centers. The Quadrics interconnection network (QsNET) [7] is currently used in some of the largest parallel systems in the world, typically connecting Compaq Alpha-based servers, but increasingly other compute platforms too.^1 The QsNET incorporates several innovative design features very similar to those defined by the InfiniBand specification, which are likely to appear in the commodity market in the next few years.
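The collective I/O idea mentioned above can be illustrated with a minimal single-process sketch of the classic two-phase scheme: instead of each process issuing many small strided writes to an interleaved file, the data is first reshuffled so that each process owns one contiguous region, which it then writes with a single large request. All function and variable names below are illustrative, not taken from any of the libraries cited.

```python
# Minimal single-process sketch of two-phase collective I/O.
# File layout is interleaved: block j of rank r lives at index j*nranks + r.
import io

def independent_writes(f, blocks_per_rank):
    """Each rank writes its own strided blocks: many small requests."""
    nranks = len(blocks_per_rank)
    requests = 0
    for rank, blocks in enumerate(blocks_per_rank):
        for j, block in enumerate(blocks):
            f.seek((j * nranks + rank) * len(block))
            f.write(block)
            requests += 1
    return requests

def collective_writes(f, blocks_per_rank):
    """Phase 1: exchange blocks so each rank holds a contiguous stripe.
       Phase 2: each rank issues one large write for its stripe."""
    nranks = len(blocks_per_rank)
    nblocks = len(blocks_per_rank[0])
    block_len = len(blocks_per_rank[0][0])
    # Phase 1: rearrange blocks into file order (simulates the exchange).
    flat = [None] * (nranks * nblocks)
    for rank, blocks in enumerate(blocks_per_rank):
        for j, block in enumerate(blocks):
            flat[j * nranks + rank] = block
    # Phase 2: one contiguous write per rank.
    stripe = len(flat) // nranks
    requests = 0
    for rank in range(nranks):
        f.seek(rank * stripe * block_len)
        f.write(b"".join(flat[rank * stripe:(rank + 1) * stripe]))
        requests += 1
    return requests

if __name__ == "__main__":
    # 4 ranks, each owning 8 interleaved 4-byte blocks.
    data = [[bytes([r]) * 4 for _ in range(8)] for r in range(4)]
    f1, f2 = io.BytesIO(), io.BytesIO()
    n1 = independent_writes(f1, data)
    n2 = collective_writes(f2, data)
    assert f1.getvalue() == f2.getvalue()  # identical file contents
    print(n1, n2)  # 32 small requests vs. 4 large ones
```

Both strategies produce the same file; the collective version simply trades a data exchange among processes for far fewer, larger I/O requests, which is where the bandwidth gain of real collective I/O implementations comes from.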
Some of these salient features are: the integration of local virtual memory into a distributed virtual shared memory; remote direct memory access; a programmable processor in the network interface, which allows the implementation of intelligent communication protocols; and fault tolerance.

This work was supported by the U.S. Department of Energy through Los Alamos National Laboratory contract W-7405-ENG-36.

^1 More information on the Quadrics network can be found at http://www.c3.lanl.gov/~fabrizio/quadrics.html