A Semantic Workflow Mechanism to Realise Experimental Goals and Constraints

Edoardo Pignotti, Peter Edwards
School of Natural & Computing Sciences
University of Aberdeen
Aberdeen, AB24 5UE, Scotland
{e.pignotti, p.edwards}@abdn.ac.uk

Gary Polhill, Nick Gotts
The Macaulay Institute
Craigiebuckler
Aberdeen, AB15 8QH, UK
{g.polhill, n.gotts}@macaulay.ac.uk

Alun Preece
School of Computer Science
Cardiff University
Cardiff, CF24 3AA, UK
A.D.preece@cs.cf.ac.uk

Abstract

Workflow technologies provide scientific researchers with a flexible problem-solving environment, by facilitating the creation and execution of experiments from a pool of available services. In this paper we argue that in order to better characterise such experiments we need to go beyond low-level service composition and execution details by capturing higher-level descriptions of the scientific process. Current workflow technologies do not incorporate any representation of such experimental constraints and goals, which we refer to as the scientist’s intent. We have developed a framework based upon use of a number of Semantic Web technologies, including the OWL ontology language and the Semantic Web Rule Language (SWRL), to capture scientist’s intent. Through the use of a social simulation case study we illustrate the benefits of using this framework in terms of workflow monitoring, workflow provenance and enrichment of experimental results.

1. Introduction

In recent years researchers have become increasingly dependent on scientific resources available through the Internet, including computational modelling services and datasets. This is changing the way in which research is conducted with increasing emphasis on ‘in silico’ experiments as a way to test hypotheses. Scientific workflow technologies [22] have emerged in recent years to allow researchers to create and execute experiments given a pool of available services.
However, the current generation of technologies can only capture the experimental method and not the associated constraints and goals, which is essential if such experiments are to be truly transparent.

Many different workflow languages exist, including MoML (Modelling Markup Language) [14], BPEL (Business Process Execution Language) [2], and Scufl (Simple conceptual unified flow language) [21]. A number of tools are available for creating and enacting workflows, most notably Taverna [20] and Kepler [15]. Taverna (based on the Scufl language) is a tool developed by the myGrid project (www.mygrid.org.uk) to support ‘in silico’ experimentation in biology. It provides an editor tool for the creation of workflows and the facility to locate services from a directory via an ontology-driven search facility. Semantic support in Taverna allows the description of workflow activities but is limited to facilitating the discovery of suitable services during the design of a workflow. Kepler [15] is a workflow tool based on the MoML language; Web and Grid services, Globus Grid jobs, and GridFTP can be used as components in the workflow. Kepler extends the MoML language by introducing the concept of a Director, to define execution models and monitor the workflow.

These languages and tools are designed to capture the flow of information between services (e.g. service addresses and relations between inputs and outputs). We argue that in order to fully characterise scientific analysis we need to go beyond such low-level descriptions by capturing the experimental conditions. The aim here is to make the constraints and goals of the experiment, which we describe