Workflow-Driven Portals for the Integration of Legacy Applications

Andrew Harrison
Cardiff School of Computer Science
Cardiff University
Cardiff, CF24 3AA, UK
A.B.Harrison@cs.cardiff.ac.uk

Ian Kelley†‡, Ian Taylor†‡
Center for Computation and Technology
Louisiana State University
Baton Rouge, LA, 70803, USA
{I.R.Kelley, Ian.J.Taylor}@cs.cardiff.ac.uk

Abstract

There is a whole class of applications that would benefit from advances in Grid technology, but for various reasons cannot be easily modified to run natively in a Grid-aware environment. To enable these applications to reap the benefits of the Grid without requiring fundamental modifications, new techniques are being developed for wrapping codes with a light-weight, but Grid-aware, container. By combining workflow techniques for the management and chaining of application logic with the Grid's service architecture, complex business logic can be introduced which greatly enhances an application's scope and usefulness in today's distributed Grid environments. Exposing this functionality through Grid Portal technology further simplifies both the process of managing workflows and the user experience. In this paper, we introduce how we propose to combine legacy code integration with advanced workflow mechanisms and portal technologies to provide scientific application developers with new and useful mechanisms for leveraging legacy code in new distributed environments.

1. Introduction

Legacy-application support within Grid environments is a fundamental requirement. The term legacy in this context can be seen as being analogous to a Grid-unaware application. Given that there exists a plethora of complex scientific FORTRAN, C and C++ codes or applications which scientists trust and regularly use to conduct their science, the Grid must be able to support these applications. Many of these codes or applications are black-box systems that are interfaced through file input and output.
For such codes, iterating over time- or space-dependent data and having the flexibility to run on the Grid would be beneficial; however, control of the Grid environment is not needed. Such codes would not need to be modified to become "Grid aware," but could benefit from being exposed within Grid environments, i.e., as Web or WS-RF Services, to enable their secure use by participants within a collaboration.

Other legacy applications, however, can benefit from Grid integration, especially those that expose an environment that a programmer uses to create and execute sub-components or applications. Examples of these types of systems include workflow- or component-based systems, such as Triana [26], Kepler [5] and Cactus [15], which now form part of the so-called Grid Computing Environments (GCE). Other GCEs designed directly with the Grid in mind include VDS [12] and Grid Portals, e.g., GridSphere [22], GENIUS [2] and hosting environments such as the Globus Toolkit. GCEs can interact with the underlying Grid middleware or services directly or can use simplified application-level Grid interfaces, such as the GAT [4], or interfaces undergoing standardization, such as SAGA [25]. The latter approach has been adopted by some of these systems to provide an insulation layer between their requirements and the underlying and evolving Grid middleware services.

Conventionally, portal GCEs have been used to access Grid services and provide one-point access to the Grid environment. The advantages of this approach are clear: standard Web interfaces are re-used to expose capabilities, which imposes no cost or installation procedure on end-users. Other GCE systems (e.g.
Triana, Kepler [21], Taverna [23]) have taken a stand-alone application-based approach and, although they require pre-installation or configuration before use, they can offer a more sophisticated interaction with deployable components and their inter-dependencies through interactive workflow services or similar methods. In addition, some other approaches have taken this one step further by exposing these GCE workflow applications within portal Web interfaces, not only to combine the advantages of the two approaches, but also to allow legacy components within the workflow to be interfaced at a basic level [18].