Cluster Comput
DOI 10.1007/s10586-017-1018-x
Modeling and predicting execution time of scientific workflows
in the Grid using radial basis function neural network
Farrukh Nadeem¹ · Daniyal Alghazzawi¹ · Abdulfattah Mashat² · Khalid Fakeeh¹ · Abdullah Almalaise¹ · Hani Hagras³
Received: 1 November 2016 / Revised: 14 April 2017 / Accepted: 24 June 2017
© Springer Science+Business Media, LLC 2017
Abstract With the maturity of electronic science (e-science),
scientific applications are growing more complex, composed of
sets of coordinating tasks with complex dependencies among
them, referred to as workflows. For optimized execution of
workflows in the Grid, high-level middleware services (such as
the task scheduler, resource broker, and performance steering
service) need in-advance estimates of workflow execution
times. However, modeling and predicting workflow execution
time in the Grid is complex because a workflow comprises
several tasks, the tasks execute in a distributed manner on
multiple heterogeneous Grid-sites, and the shared Grid
resources behave dynamically. In this paper, we describe a
novel method based on a radial basis function neural network to
model and predict workflow execution time in the Grid. We
model workflow execution time in terms of attributes
describing workflow structure and execution runtime
information. To further refine our models, we employ principal
component analysis to eliminate attributes of lesser
importance. We recommend a set of only 14 attributes (out of a
total of 21) to effectively model workflow execution time. Our
reduced set of attributes improves the prediction accuracy by
16%. Results of our prediction experiments for three
real-world scientific workflows are presented to show that our
predictions are more accurate than those of the two best
methods from related work.
✉ Farrukh Nadeem
fabdullatif@kau.edu.sa
Abdulfattah Mashat
asmashat@uj.edu.sa
Hani Hagras
hani@essex.ac.uk
¹ Department of Information Systems, Faculty of Computing
and Information Technology, King Abdulaziz University,
Jeddah, Saudi Arabia
² University of Jeddah, Jeddah, Saudi Arabia
³ School of Computer Science and Electronic Engineering, The
Computational Intelligence Centre, University of Essex,
Colchester, UK
Keywords Scientific workflow applications · Distributed
execution of scientific workflows · Workflow execution time
prediction in the Grid
1 Introduction
Computational Grids enable application developers to aggre-
gate heterogeneous computational and storage resources
scattered around the globe for large-scale scientific and
engineering research. Scientific workflow applications (hereafter
referred to as scientific workflows or simply workflows) have
recently emerged as an important paradigm for representing
and managing complex scientific computations. Typically a
workflow consists of a set of tasks (software executions or
data transfers), which are coordinated by control and data
flow dependencies to solve a complex problem. Workflow
applications are usually executed on the Grid automatically through
workflow management systems such as Askalon [14], GridFlow [4], and
Pegasus [10]. The runtime environment of such a workflow management
system schedules and manages the execution of workflow tasks on
multiple Grid-sites with the objective of minimizing the workflow
execution time. Workflow execution time is widely considered
a metric of workflow performance [14].
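To make the definition above concrete, a workflow can be viewed as a directed acyclic graph of tasks. The following is a minimal sketch, assuming hypothetical task names and per-task runtimes that are not taken from this paper. It computes the end-to-end execution time of such a graph when every task starts as soon as all of its predecessors have finished. It deliberately ignores queueing, data transfers, and resource sharing, which are the effects that make prediction in the Grid hard, and it is not the prediction method proposed in this paper.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def workflow_makespan(runtimes, deps):
    """Return the finish time of the last task in a workflow DAG.

    runtimes: dict mapping task name -> runtime in seconds (illustrative values)
    deps:     dict mapping task name -> set of tasks it depends on
    """
    graph = {task: deps.get(task, set()) for task in runtimes}
    finish = {}
    # static_order() yields each task only after all of its predecessors.
    for task in TopologicalSorter(graph).static_order():
        start = max((finish[p] for p in graph[task]), default=0.0)
        finish[task] = start + runtimes[task]
    return max(finish.values(), default=0.0)


# Hypothetical four-task workflow: "prepare" fans out to two
# simulations, whose results are combined by "merge".
runtimes = {"prepare": 10.0, "sim_a": 40.0, "sim_b": 25.0, "merge": 5.0}
deps = {"sim_a": {"prepare"}, "sim_b": {"prepare"}, "merge": {"sim_a", "sim_b"}}

print(workflow_makespan(runtimes, deps))  # 55.0
```

Here the longest dependency chain (prepare, sim_a, merge) determines the result. In a real Grid the per-task runtimes are themselves unknown in advance and vary with the Grid-site and its load, which is why a learned model of the whole workflow is attractive.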
Workflow schedulers, enactment engines, and performance
analysis services are common components of these runtime
environments that rely on execution time modeling of
scientific applications to make crucial strategic decisions
and to determine the causes of performance problems.