Eliminating The Middleman: Peer-to-Peer Dataflow

Adam Barker
National e-Science Centre, University of Edinburgh
a.d.barker@ed.ac.uk

Jon B. Weissman
University of Minnesota, Minneapolis, MN, USA
jon@cs.umn.edu

Jano van Hemert
National e-Science Centre, University of Edinburgh
j.vanhemert@ed.ac.uk

ABSTRACT

Efficient execution of large-scale, data-intensive workflows such as Montage must take into account the volume and pattern of communication. When orchestrating data-centric workflows, the centralised servers common to standard workflow systems can become a performance bottleneck. However, standards-based workflow systems that rely on centralisation, e.g., Web service based frameworks, have many other benefits, such as a wide user base and sustained support.

This paper presents and evaluates a light-weight hybrid architecture which maintains the robustness and simplicity of centralised orchestration, but facilitates choreography by allowing services to exchange data directly with one another. Furthermore, our architecture is standards compliant, flexible and non-disruptive; service definitions do not have to be altered prior to enactment. Our architecture could be realised within any existing workflow framework; in this paper, we focus on a Web service based framework.

Taking inspiration from Montage, a number of common workflow patterns (sequence, fan-in and fan-out), input to output data size relationships and network configurations are identified and evaluated. The performance analysis concludes that a substantial reduction in communication overhead results in a 2- to 4-fold performance benefit across all patterns. An end-to-end pattern through the Montage workflow results in an 8-fold performance benefit and demonstrates how the advantage of using our hybrid architecture increases as the complexity of a workflow grows.
Categories and Subject Descriptors
C.2.4 [Computer-Communication Networks]: Distributed Systems; C.4 [Performance of Systems]; D.2.11 [Software Engineering]: Software Architectures

General Terms
Design, Performance.

Keywords
Decentralised orchestration, workflow optimisation.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
HPDC’08, June 23–27, 2008, Boston, Massachusetts, USA.
Copyright 2008 ACM 978-1-59593-997-5/08/06 ...$5.00.

1. INTRODUCTION

Efficient execution of the large-scale, data-intensive workflows common to scientific applications must take into account the volume and pattern of communication. For example, in Montage [7] an all-sky mosaic computation can require between 2 and 8 TB of data movement. Standard workflow tools based on a centralised enactment engine, such as Taverna [19] and OMII BPEL Designer [18], can easily become a performance bottleneck for such applications: extra copies of the data (intermediate data) are sent, consuming network bandwidth and overwhelming the central engine. Instead, a solution is desired that permits data output from one stage to be forwarded directly to where it is needed at the next stage in the workflow. It is certainly possible to develop a workflow system from scratch that implements this kind of optimisation. In contrast, workflow systems based on concrete industrial standards offer a different set of benefits: they have a much larger and wider user base, which allows them to leverage a greater availability of supported tools and application components. This paper explores the extent to which the benefits of each approach can be realised.
Can a standards-based workflow system achieve the performance optimisations of custom systems, and what are the trade-offs?

1.1 Orchestration and Choreography

There are two common architectural approaches to implementing workflows: service orchestration and service choreography. Service orchestration describes how services can interact at the message level, with an explicit definition of the control flow and data flow. Orchestrations can span multiple applications and/or organisations, and the services themselves have no knowledge of their involvement in a higher level application. A central process always acts as a controller to the involved services; both control and data flow messages pass through this centralised server. The Business Process Execution Language (BPEL) [15] is the current de facto standard for orchestrating Web services.

Service choreography, on the other hand, is more collaborative in nature. A choreography model describes a peer-to-peer collaboration between a collection of services in order to achieve a common goal. Choreography focuses on message exchange; all involved services are aware of their partners and of when to invoke operations. The Web Services Choreography Description Language (WS-CDL) [9] is an XML-based language proposed for choreography. Currently this language is in the W3C Candidate Recommendation stage and there are no concrete implementations.
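
As a rough illustration of why routing data through a central controller is costly, the following Python sketch counts the data volume that crosses the network for a simple sequence of services under the two data flow styles described above. It is a simplified model introduced here for illustration only, not the architecture or the measurements evaluated in this paper. The function names are hypothetical, control messages are assumed to be negligible in size, and each transfer is assumed to move the full input or output of a service.

```python
from typing import Sequence


def orchestrated_volume(sizes: Sequence[int]) -> int:
    """Data volume moved when all data passes through a central engine.

    sizes[0] is the size of the initial input held by the engine and
    sizes[i] is the size of the output of service i (hypothetical model).
    Service i receives sizes[i - 1] from the engine and sends sizes[i]
    back to the engine, so every intermediate result crosses the
    network twice.
    """
    total = 0
    for i in range(1, len(sizes)):
        total += sizes[i - 1]  # engine -> service i (input)
        total += sizes[i]      # service i -> engine (output)
    return total


def direct_volume(sizes: Sequence[int]) -> int:
    """Data volume moved when services forward output to the next stage.

    The engine sends the initial input to the first service, each
    service passes its output directly to its successor, and the last
    service returns the final output to the engine. Every data item
    therefore crosses the network exactly once.
    """
    return sum(sizes)


if __name__ == "__main__":
    # Three services in sequence, each producing 100 units of output
    # from 100 units of input (illustrative numbers, not measurements).
    sizes = [100, 100, 100, 100]
    print("centralised:", orchestrated_volume(sizes))
    print("direct:     ", direct_volume(sizes))
```

In this model the two styles move the same volume for a single service, and the gap widens as the sequence grows, because each additional stage adds two transfers through the engine but only one direct transfer. The sketch counts bytes only; it says nothing about latency, bandwidth or the network configurations considered later.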