A High-Level Distributed Execution Framework for Scientific Workflows
Jianwu Wang¹, Ilkay Altintas¹, Chad Berkley², Lucas Gilbert¹, Matthew B. Jones²
¹ San Diego Supercomputer Center, UCSD, U.S.A.
{jianwu, altintas, iktome}@sdsc.edu
² National Center for Ecological Analysis and Synthesis, UCSB, U.S.A.
{berkley, jones}@nceas.ucsb.edu
Abstract
Domain scientists synthesize different data and
computing resources to solve their scientific problems.
Distributed execution within scientific workflows is a
growing and promising way to achieve better execution
performance and efficiency. This paper presents a
high-level distributed execution framework designed
around the distributed execution requirements identified
within the Kepler community. It also discusses
mechanisms that make the presented framework easy to
use, comprehensive, adaptable, extensible, and efficient.
1. Introduction
Scientific workflow management systems, e.g.,
Taverna [1], Triana [2], Pegasus [3], Kepler [4],
ASKALON [7] and SWIFT [12], have demonstrated
their ability to help domain scientists solve scientific
problems by synthesizing different data and computing
resources. Scientific workflows can operate at different
levels of granularity, from low-level workflows that
explicitly move data around, start and monitor remote
jobs, etc. to high-level "conceptual workflows" that
interlink complex, domain specific data analysis steps.
Distributed execution and Grid workflows can be seen
as one type of scientific workflow. Most workflow
systems centralize execution [5], which often causes a
performance bottleneck. We summarize requirements
within the Kepler community and propose our
distributed execution framework to take advantage of
abundant distributed computing resources to achieve
better execution performance and efficiency. Based on
community feedback, our goals for the Kepler
distributed execution framework include the ability to
easily form ad-hoc networks of cooperating Kepler
instances. Each cooperating network can impose access
constraints and allow Kepler models or sub-models to
run on participating instances. Once a cooperating
network has been created, users can configure one or
more subcomponents of a workflow to be distributed
across the nodes of the network. The major
contribution of this paper is
demonstrating a distributed scientific workflow
approach that combines an intuitive user interface,
collaborative features, and capabilities for distribution
of workflow tasks and the workflows themselves in a
single framework.
In Section 2, we discuss the background of
scientific workflow distributed execution. Sections 3
and 4 describe the conceptual architecture and
framework. We demonstrate a case study in Section 5
to show how the framework works. Finally, we
conclude and explain future work in Section 6.
2. Background
Our work is based on the following aspects:
structure of scientific workflow specifications, typical
distributed execution requirements as specified by
scientists, and prior work in distributed execution.
2.1. Scientific Workflow Specification
Structure
There are several different formats for representing
scientific workflows [14, 15, 16], but they are
generally graph descriptions that represent three types
of components: tasks, data dependencies, and control
dependencies [5]. For example, in Figure 1 the tasks
T2 and T3 will be executed under different conditions.
Additionally, T4 needs to get data from either T2 or T3
before its execution. Since our framework incorporates
Fourth IEEE International Conference on eScience
978-0-7695-3535-7/08 $25.00 © 2008 IEEE
DOI 10.1109/eScience.2008.166