TERAPATHS: A QOS-ENABLED COLLABORATIVE DATA SHARING INFRASTRUCTURE FOR PETA-SCALE COMPUTING RESEARCH

S. Bradley, F. Burstein, B. Gibbard, D. Katramatos, R. Popescu, D. Stampf, D. Yu, BNL, Upton, NY 11793, USA
L. Cottrell, Y. Li, SLAC, Menlo Park, CA 94025, USA
S. McKee, University of Michigan, Ann Arbor, MI 48109, USA

Abstract

TeraPaths, a DOE MICS/SciDAC funded project, deployed and prototyped the use of differentiated networking services to support the global movement of data in the high energy physics distributed computing environment. While this MPLS/LAN QoS work initially targeted networking issues specifically at BNL, the experience acquired and expertise developed are globally applicable to the ATLAS experiment and the high energy physics community in general. TeraPaths dedicates fractions of the available network bandwidth to ATLAS Tier 1 data movement and limits its disruptive impact on BNL's heavy ion physics program and other more general laboratory network needs. We developed a web service-based software system that automates QoS configuration in LAN paths and negotiates network bandwidth with remote network domains on behalf of end users. Our system architecture can be easily integrated with other network management tools to provide a complete end-to-end QoS solution. We demonstrated the effectiveness of TeraPaths in data transfer activities within and originating from Brookhaven National Laboratory. Our continued work focuses on strategically scheduling network resources to shorten the transfer time for mission-critical data relocation, thus reducing network error rates, which are proportional to transfer times. Such network resources typically span several administrative domains and exhibit unique management difficulties. Overall, our goal remains the provisioning of a robust and effective network infrastructure for high energy and nuclear physics research.
INTRODUCTION

The extreme demands in networking capacity encountered in the world of modern high energy and nuclear physics make evident the need for the capability to distinguish between various data flows and enable the network to treat each flow differently. Not all network flows are of equal priority and importance; the default network behavior, however, is to treat them as such. Thus, the competition among flows for network bandwidth can cause severe slowdowns for all flows, independent of importance, and can furthermore cause some applications to fail. As an example, Brookhaven National Laboratory (BNL) routinely carries out Relativistic Heavy Ion Collider (RHIC) [1] production data transfers and Large Hadron Collider (LHC) [2] Monte Carlo challenges between the laboratory and several remote collaborators. The aggregate of the peak network requirements of such transfers is well beyond the capacity of the BNL network. To ensure that RHIC production data transfers are not affected, it is necessary to constrain LHC data transfers to opportunistically utilize available bandwidth.

The TeraPaths project enables data transfers with speed and reliability guarantees, crucial to applications with deadlines, expectations, and critical decision-making requirements, through the use of differentiated networking services. TeraPaths offers the capability to selectively or collectively configure network equipment to dedicate fractions of the available bandwidth to various data movements and replications, thus assuring adequate throughput and limiting the disruptive impact of flows upon each other. This capability is deemed essential for the ATLAS [3] distributed data environment (see Figure 1).

Figure 1: ATLAS data distribution.
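Constraining a flow to a bandwidth fraction is typically done by conditioning its traffic against a token bucket. The following is a minimal sketch of that idea, not code from the TeraPaths system: tokens accumulate at the guaranteed rate, and a packet is admitted only if enough tokens are available (a policer drops non-conforming packets; a shaper would instead queue them until tokens accumulate).

```python
import time


class TokenBucket:
    """Minimal token-bucket conditioner: a flow may send at most
    `rate` bytes/s on average, with bursts of up to `burst` bytes."""

    def __init__(self, rate: float, burst: float):
        self.rate = rate          # sustained rate in bytes/s
        self.burst = burst        # maximum burst size in bytes
        self.tokens = burst       # bucket starts full
        self.last = time.monotonic()

    def allow(self, nbytes: int) -> bool:
        """Return True if a packet of nbytes conforms to the profile."""
        now = time.monotonic()
        # Replenish tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False  # non-conforming: drop (police) or delay (shape)
```

For example, with a 500-byte burst allowance, a first 400-byte packet conforms but an immediate second one does not, so the flow cannot exceed its allotted share for long.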
The network managing capabilities enabled by this project can be integrated into the infrastructure of grid computing systems and allow the scheduling of network resources along with CPU and storage resources, enhancing the overall performance and efficiency of DOE computing facilities.

SYSTEM DESIGN

Modern networking hardware offers a range of architectures for providing QoS guarantees to data flows. We chose to design TeraPaths around the DiffServ architecture [4] because, with this architecture, traffic needs to be conditioned (policed/shaped) only at the network boundary; DiffServ is thus highly scalable. Up to 64 traffic categories, or classes, are supported, using six bits of the Type of Service (ToS) byte, known as the DSCP bits. Treatment of data is determined on a per-packet basis. In contrast, the IntServ architecture (RSVP protocol) determines treatment on a per-flow basis and thus requires the main-
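Because the DSCP occupies the upper six bits of the ToS byte, a codepoint is placed on the wire by shifting it left two bits, which is also why 64 classes are possible. As a hedged illustration (not part of the TeraPaths software itself), a sender on Linux can mark its own packets for a DiffServ class through the standard IP_TOS socket option; the codepoint values below are the standard Expedited Forwarding and Assured Forwarding ones.

```python
import socket

# Standard DiffServ codepoints (illustrative subset).
DSCP_BEST_EFFORT = 0
DSCP_AF21 = 18  # Assured Forwarding class 2, low drop precedence
DSCP_EF = 46    # Expedited Forwarding

def tos_byte(dscp: int) -> int:
    """The 6-bit DSCP occupies the upper bits of the 8-bit ToS byte."""
    if not 0 <= dscp < 64:
        raise ValueError("DSCP is a 6-bit value")
    return dscp << 2

def mark_socket(sock: socket.socket, dscp: int) -> None:
    """Ask the kernel to mark all packets sent on this socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_byte(dscp))

if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    mark_socket(s, DSCP_EF)  # EF (46) -> ToS byte 0xB8
    s.close()
```

End-host marking like this only expresses intent; the per-class treatment itself is enforced by the DiffServ-configured routers and switches along the LAN path, which is what the TeraPaths software automates.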