Hybrid Performance Modeling and Prediction of
Large-Scale Computing Systems
Sabri Pllana and Siegfried Benkner
Institute of Scientific Computing
Faculty of Computer Science
University of Vienna
Nordbergstrasse 15/C/3
1090 Vienna, Austria
Email: {pllana,sigi}@par.univie.ac.at
Fatos Xhafa
Department of Languages
and Informatics Systems
Polytechnic University of Catalonia
C/Jordi Girona 1-3
08034 Barcelona, Spain
Email: fatos@lsi.upc.edu
Leonard Barolli
Department of Information
and Communication Engineering
Fukuoka Institute of Technology
3-30-1 Wajiro-Higashi, Higashi-ku
Fukuoka 811-0295, Japan
Email: barolli@fit.ac.jp
Abstract—Performance is a key feature of large-scale computing
systems. However, the achieved performance when a
certain program is executed is significantly lower than the
maximal theoretical performance of the large-scale computing
system. Model-based performance evaluation may be used
to support performance-oriented program development for
large-scale computing systems. In this paper we present a
hybrid approach for performance modeling and prediction of
parallel and distributed computing systems, which combines
mathematical modeling and discrete-event simulation. We use
mathematical modeling to develop parameterized performance
models for components of the system. Thereafter, we use discrete-event
simulation to describe the structure of the system and the
interaction among its components. As a result, we obtain a high-level
performance model, which combines the evaluation speed
of mathematical models with the structure awareness and fidelity
of the simulation model. We empirically evaluate our approach
with a real-world material science program that comprises more
than 15,000 lines of code.
I. INTRODUCTION
The solution of resource-demanding scientific and engineering
computational problems involves the execution of
programs on large-scale computing systems, which commonly
consist of multiple computational nodes, in order to solve large
problems or to reduce the time to solution for a single problem.
However, there is a widening gap between the maximal
theoretical performance and the achieved performance when a
certain program is executed on a large-scale parallel and dis-
tributed computing system. This gap may be reduced by tuning
the performance of a program for a specific computing system.
Commonly, the programmer develops multiple versions
of the program following various parallelization strategies.
Thereafter, the programmer assesses the performance of each
program version, and selects the program version that achieves
the highest performance. The code-based performance tuning
of a program is a time-consuming and error-prone process that
involves many cycles of code editing, compilation, execution,
and performance analysis. This problem may be alleviated by
using model-based performance evaluation.
In this paper we present a methodology and the corresponding
tool support for performance modeling and prediction of
parallel and distributed computing systems, which may be used
in the process of performance-oriented program development
for providing performance prediction results starting from the
early program development stages. Based on the performance
model, the performance can be predicted and design decisions
can be influenced without time-consuming modifications of
large parts of an implemented program.
We propose a hybrid approach for performance modeling
and prediction of parallel and distributed computing systems,
which combines mathematical modeling and discrete-event
simulation. Our aim is to combine the evaluation speed of
mathematical models with the structure awareness and fidelity
of the simulation model. For the purpose of evaluation of our
approach we have developed a performance modeling and
prediction system called Performance Prophet. We demon-
strate the usefulness of Performance Prophet by modeling
and simulating a real-world material science program that
comprises more than 15,000 lines of code. In our case study,
the model evaluation with Performance Prophet on a
single-processor workstation is several thousand times faster than the
execution of the real program on our cluster.
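The idea of combining the two techniques can be illustrated with a minimal sketch. It is not the implementation of Performance Prophet; the function names, the cost formulas (operation count times time per operation, and latency plus size over bandwidth), and the modeled scenario (every process computes and then sends one message to a root that receives messages one at a time) are assumptions chosen only for illustration. The component costs are evaluated analytically, while a small discrete-event loop captures the structure of the system and the contention at the root.

```python
import heapq


def compute_time(n_ops, t_op):
    """Analytical cost of a compute block (assumed form: ops * time per op)."""
    return n_ops * t_op


def message_time(n_bytes, latency, bandwidth):
    """Analytical cost of one message (assumed form: latency + size / bandwidth)."""
    return latency + n_bytes / bandwidth


def simulate(work_per_proc, t_op, n_bytes, latency, bandwidth):
    """Predict the completion time of a hypothetical compute-then-gather program.

    Each process computes its share of work and then sends one message to a
    root that can receive only one message at a time.
    """
    events = []  # heap of (time, process id, event kind)
    for proc, n_ops in enumerate(work_per_proc):
        # The duration of each compute block comes from the mathematical model.
        heapq.heappush(events, (compute_time(n_ops, t_op), proc, "compute_done"))

    root_free_at = 0.0  # time at which the root can accept the next message
    finish = 0.0
    while events:
        time, proc, kind = heapq.heappop(events)
        if kind == "compute_done":
            # The simulation decides when the transfer can start (contention);
            # the mathematical model decides how long it takes.
            start = max(time, root_free_at)
            root_free_at = start + message_time(n_bytes, latency, bandwidth)
            heapq.heappush(events, (root_free_at, proc, "recv_done"))
        else:
            finish = max(finish, time)
    return finish
```

For example, `simulate([100, 100], 1.0, 10, 1.0, 10.0)` models two processes that finish computing at the same time; the second message must wait for the first, which a single closed-form expression without structural information would not capture.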
The rest of this paper is organized as follows. Our approach
for hybrid performance modeling and prediction of parallel
and distributed computing systems is described in Section II.
We empirically evaluate our approach in Section III. Related
work is discussed in Section IV. Finally, Section V
concludes the paper and briefly describes future work.
II. HYBRID PERFORMANCE MODELING AND PREDICTION
Mathematical modeling (MathMod) or discrete-event
simulation (DES) is commonly used for performance modeling
of computing systems. When applied separately, each of these
approaches has severe limitations.
Mathematical models commonly represent the whole computing
system as a symbolic expression that lacks structural
information [1]. An example of a mathematical performance
model for the program execution time is expressed as
follows,
$$T_{ProgExec} = C_{Op} \, T_{Av},$$
International Conference on Complex, Intelligent and Software Intensive Systems
0-7695-3109-1/08 $25.00 © 2008 IEEE
DOI 10.1109/CISIS.2008.20
132