A Knowledge-Based Approach to Interactive Workflow Composition

Jihie Kim, Yolanda Gil, Marc Spraragen
University of Southern California/Information Sciences Institute
Marina del Rey, CA 90292 USA
+1 310 448 8769
{jihie, gil, marcs}@isi.edu

ABSTRACT
Complex applications in many areas, including scientific computations and business-related web services, are created from collections of components to form computational workflows. In many cases end users have requirements and preferences that depend on how the workflow unfolds, and that cannot be specified beforehand. Workflow editors therefore need to be augmented with intelligent assistance in order to help users in several key aspects of the task, namely: 1) keeping track of detailed constraints across selected components and their connections; 2) flexibly accommodating different strategies for constructing workflows, e.g., from general knowledge of necessary tasks, from desired results, or from available data; and 3) taking partial or incomplete descriptions of workflows and understanding the steps needed for their completion. We have developed a system called CAT (Composition Analysis Tool) that analyzes workflows and generates error messages and suggestions in order to help users compose complete and consistent workflows. Our approach combines knowledge bases, which have rich representations of components and constraints, with planning techniques that can track the relations and constraints among individual components. We have formalized our approach based on AI planning principles, allowing us to formulate claims about the underlying algorithms as well as the resulting workflows.

Keywords
workflow composition, description logic, interactive approach

INTRODUCTION
Composing computational workflows is essential in many areas, including scientific computing and business applications.
For example, scientists increasingly need to dynamically produce computational workflows in which they assemble and link various models that address different aspects of the phenomenon under study [Griphyn 03, SCEC 03, Geodise 03, MyGrid 03]. In business applications, web services are becoming a promising framework for composing new applications out of existing software components (such as software modules or web services). Some planning approaches have been used in this context [Lansky et al 95, Chien et al 96].

However, automatic planning approaches are not always appropriate for generating these workflows. In some cases, users may not have explicit descriptions of the desired end results or goals at the outset. Users may only have high-level, partial, or incomplete descriptions of the desired outcome or the initial state, and the real goals and initial data input may become clear only as they see the features of the components that can be used. Business agreements and past experience of how the components work may also affect the development of the workflow.

The goal of our work is to develop interactive tools for composing workflows where users select and configure components and the system assists the users by detecting errors and providing intelligent suggestions to solve them. This provides a capability that complements automatic planning in developing computational workflows.

This paper presents an approach to interactive workflow construction. Novel features of the approach include: 1) combining description logic and planning frameworks; 2) interactive workflow construction through planning techniques; 3) properties for verification of manually created workflows based on planning techniques; and 4) an algorithm that assists users by enforcing those properties and suggesting possible next steps.
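To make the idea of detecting errors and suggesting next steps concrete, the following is a minimal sketch in Python, not the CAT implementation. It assumes that components are described only by the data types they consume and produce, and it checks a partial workflow for unsatisfied inputs and inconsistent links. All names here (`Component`, `Workflow`, `analyze`, and the earthquake-flavored component and data-type names) are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    """A hypothetical workflow component with typed inputs and outputs."""
    name: str
    inputs: tuple = ()
    outputs: tuple = ()


@dataclass
class Workflow:
    """A partial workflow: chosen components, links, and user-supplied data."""
    components: list = field(default_factory=list)
    # Each link is a triple (producer_name, data_type, consumer_name).
    links: list = field(default_factory=list)
    initial_data: set = field(default_factory=set)


def analyze(workflow, library):
    """Return (errors, suggestions) for a partial workflow.

    Assumption: plain string matching on data types stands in for the
    richer constraint reasoning described in the paper.
    """
    errors, suggestions = [], []
    by_name = {c.name: c for c in workflow.components}

    # Consistency: every link must join a real output to a real input.
    for producer, dtype, consumer in workflow.links:
        if producer not in by_name or dtype not in by_name[producer].outputs:
            errors.append(f"link error: {producer} does not produce '{dtype}'")
        if consumer not in by_name or dtype not in by_name[consumer].inputs:
            errors.append(f"link error: {consumer} does not consume '{dtype}'")

    # Completeness: every input must be linked or supplied as initial data.
    for comp in workflow.components:
        for dtype in comp.inputs:
            linked = any(c == comp.name and d == dtype
                         for _, d, c in workflow.links)
            if linked or dtype in workflow.initial_data:
                continue
            errors.append(f"{comp.name}: input '{dtype}' is not satisfied")
            for cand in library:
                if dtype in cand.outputs:
                    suggestions.append(
                        f"add {cand.name} to produce '{dtype}' for {comp.name}")
    return errors, suggestions


# Hypothetical component library for illustration only.
forecast = Component("RuptureForecast", outputs=("rupture_forecast",))
hazard = Component("HazardCurve",
                   inputs=("rupture_forecast", "site_location"),
                   outputs=("hazard_curve",))
library = [forecast, hazard]

# The user starts from the desired result and supplies only the site.
wf = Workflow(components=[hazard], initial_data={"site_location"})
errors, suggestions = analyze(wf, library)
print(errors)
print(suggestions)
```

In this sketch the user begins with only the final component, so the analysis reports the unsatisfied input and proposes a component that could provide it. The approach presented in this paper relies on knowledge bases and planning techniques rather than the simple type matching used here.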
Using this approach, we have developed the CAT system to analyze a partial workflow composed by the user, notify the user of issues to be resolved in the current workflow, and suggest to the user what actions could be taken next. In this paper we focus on the planning techniques that are used in our framework. Details of our system’s interfaces and user interactions are described in [Kim et al 04].

The paper begins by describing our motivations and goals based on a scientific application (earthquake simulation) which led us to develop CAT. We outline the representation that we have developed to describe workflow components. We define desirable properties of workflows, and the composition actions that the user can employ to refine the workflow. We then present the