The File Mover: An Efficient Data Transfer System for Grid Applications

Cosimo Anglano, Massimo Canonico
Dipartimento di Informatica, Università del Piemonte Orientale, Alessandria (Italy)
email: {cosimo.anglano,massimo.canonico}@unipmn.it

Abstract

In this paper we present the File Mover, a data transfer system designed to optimize the transfer of potentially very large files. The File Mover relies on an overlay network architecture, in which a set of machines cooperate in a transfer by forwarding portions of the file being transferred to one another. Data transfer times are minimized by choosing, for each transfer, the set of relays that maximizes the expected throughput. Preliminary experiments show that the File Mover is able to profitably exploit existing network paths not chosen by IP routing algorithms, thereby enhancing file transfer performance.

1 Introduction

Scientific exploration in many disciplines, such as High Energy Physics, Climate Modeling, and Life Sciences, requires the processing of massive data collections, whose size is on the order of Terabytes (and sometimes even Petabytes) [22]. For these applications, the creation of Data Grids [8], which pool geographically distributed storage and computing resources, seems a promising solution. To achieve satisfactory performance, Data Grids require a system able to transfer potentially huge files in the shortest possible amount of time [9]. Indeed, the completion time of typical Data Grid applications is given by the sum of their execution time and of the time taken to transfer the data they need [26], and is often dominated by the data transfer time. To the best of our knowledge, all existing file transfer systems [2, 7, 15, 16, 30, 32] rely on transport-level protocols (namely TCP and UDP) to move data from their source to their destination.
Although these systems exploit highly sophisticated techniques to increase the transfer rate, they share a common drawback: they rely on the IP routing protocols, so the throughput they achieve is limited by the bandwidth available on the network path chosen by the IP routing layer. Unfortunately, the IP routing protocols notoriously produce suboptimal routes [12, 13, 29], since their choice of network paths is not guided by performance considerations; they are primarily concerned with the exchange of connectivity information. As a consequence, it is not infrequent that shorter transfer times can be obtained by choosing a network path different from the one chosen by the IP routing algorithms. For instance, Savage et al. observe [13] that for 30 to 80 percent of the network paths chosen by the IP routing algorithms between pairs of Internet hosts, taken from a relatively large set of machines, it was possible to find alternative paths with better performance characteristics.

Another consequence of the reliance on the IP routing algorithms is the possibly long time required to recover from link failures. The fault recovery mechanisms used by typical Internet routing protocols sometimes take many minutes to converge to a consistent form [19], and there are times when path outages lead to significant disruptions in communication lasting tens of minutes or more [10, 23, 24]. As a consequence, transfers along faulty paths may be delayed for a very long time, so the effective bandwidth obtained in these situations drops below any reasonable value.

In this paper we describe the File Mover, a software system that addresses the above problems by exploiting an overlay network architecture.
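To make the motivation concrete, the following Python sketch shows how a path through intermediate hosts can offer a higher bottleneck bandwidth than the direct path chosen by IP routing. It is only an illustration under stated assumptions: the function name `widest_path`, the host names, and all bandwidth figures are hypothetical, and the algorithm is a standard maximum-bottleneck variant of Dijkstra's algorithm used here as a stand-in, not the relay selection procedure of the File Mover.

```python
import heapq


def widest_path(links, source, dest):
    """Return (bottleneck, path) for the path whose slowest link is fastest.

    `links` maps a directed pair (u, v) to the bandwidth available from u to v.
    This is a generic maximum-bottleneck search, not the File Mover's method.
    """
    # Build adjacency lists for the directed graph.
    adj = {}
    for (u, v), bw in links.items():
        adj.setdefault(u, []).append((v, bw))

    best = {source: float("inf")}  # best known bottleneck per node
    prev = {}                      # predecessor on the best path
    heap = [(-float("inf"), source)]  # max-heap via negated bottleneck
    while heap:
        neg_width, node = heapq.heappop(heap)
        width = -neg_width
        if node == dest:
            break
        if width < best.get(node, 0.0):
            continue  # stale heap entry
        for nxt, bw in adj.get(node, []):
            cand = min(width, bw)  # a path is as fast as its slowest link
            if cand > best.get(nxt, 0.0):
                best[nxt] = cand
                prev[nxt] = node
                heapq.heappush(heap, (-cand, nxt))

    if dest not in best:
        return 0.0, []  # destination unreachable
    path = [dest]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return best[dest], path


# Hypothetical bandwidths in Mbit/s; ("src", "dst") is the direct IP path.
links = {
    ("src", "dst"): 10,
    ("src", "A"): 40, ("A", "dst"): 25,
    ("src", "B"): 30, ("B", "dst"): 35,
    ("A", "B"): 5, ("B", "A"): 50,
}
bottleneck, path = widest_path(links, "src", "dst")
print(bottleneck, path)  # 30 ['src', 'B', 'dst']
```

With these made-up figures the direct path offers 10 Mbit/s, while forwarding through host B raises the bottleneck to 30 Mbit/s. Each direction of a pair is stored as a separate entry, so the two directions may carry different bandwidths.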
An overlay network is a virtual network, layered on top of the existing Internet, whose member nodes are placed at the edges of the underlying physical network and communicate by means of a transport-level protocol (e.g., TCP or UDP). Each pair of member nodes of an overlay network (henceforth referred to as Relays) communicates by means of a virtual link, which corresponds to the network path chosen by the IP layer to transfer data from one member to the other. The relays of an overlay network agree to forward each other’s traffic along one or more virtual links, until the destination host is reached. Figure 1 schematically depicts a possible configuration of the File Mover, in which an overlay network comprising four relays (A, B, C, and D) is assumed. Note that virtual links are unidirectional, as IP routing is in gen-