UDT as an Alternative Transport Protocol for GridFTP

John Bresnahan (1,2,3), Michael Link (1,2), Rajkumar Kettimuthu (1,2), and Ian Foster (1,2,3)

(1) Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439
(2) Computation Institute, University of Chicago, Chicago, IL 60637
(3) Department of Computer Science, University of Chicago, Chicago, IL 60637

Abstract

GridFTP has emerged as a de facto standard for secure, reliable, high-performance data transfer across resources on the Grid. By default, GridFTP uses TCP as its transport-level communication protocol. It is well known that TCP Reno cannot provide satisfactory performance on high-speed, long-delay networks. In this paper, we describe how we enabled GridFTP to use UDT as an alternative transport-level communication protocol. We compare the performance of GridFTP over UDT with GridFTP over TCP on various test beds. We also study the impact of UDT on bulk TCP flows.

1. Introduction

GridFTP [1] is commonly used as a data transfer protocol in the Grid. The GridFTP protocol extends the standard FTP protocol and provides a superset of the features offered by the various Grid storage systems currently in use. Key features of GridFTP include the following:

- Security: The Globus GridFTP [2] server/client utilizes the GSI protocol, which not only enables a secure Public Key Infrastructure (PKI) interface but also adds the capability of delegated authority via X.509 certificates. Delegated authority is critical for large collaboration efforts and enables single sign-on in virtual organizations, thereby eliminating the need for the user to enter passwords at what can be hundreds of different sites. Kerberos is also supported.

- Parallelism: On wide-area links, using multiple TCP streams in parallel between a single source and destination can improve aggregate bandwidth relative to that achieved by a single stream.
GridFTP supports such parallelism via FTP command extensions and data channel extensions.

- Striping: GridFTP also supports striped data movement, in which data distributed across, or generated by, a set of computers or storage systems at one end of a network is transferred to another remote set of storage systems or computers.

- Third-Party Control: GridFTP allows secure third-party clients to initiate transfers between remote sites, thereby facilitating the management of large datasets for distributed communities.

- Partial File Transfer: Some applications can benefit from transferring portions of files rather than complete files. GridFTP supports requests for arbitrary file regions.

- Reliability: GridFTP provides support for reliable and restartable data transfers.

- Negotiation of TCP buffer/window sizes: GridFTP employs FTP command and data channel extensions to support both automatic and manual negotiation of TCP buffer sizes for large files as well as large sets of small files.

The Globus implementation of GridFTP provides a software suite optimized for the gamut of data access issues, from bulk file transfer to the details of getting data out of complex storage systems within sites.

Although GridFTP supports multiple TCP streams to overcome the limitations of the TCP congestion control algorithm on long, fat networks [3-5], it is still not possible to utilize the available bandwidth optimally in some situations. The UDP-based Data Transfer protocol (UDT) [6] is a popular application-level data transport protocol that addresses the limitations of TCP in fast, long-distance networks. In this paper, we describe the following:

- Development of a Globus XIO driver for UDT