Volume XXX (2018), Number XXX pp. 1–14 COMPUTER GRAPHICS forum
Making Sense of Scientific Simulation Ensembles with Semantic
Interaction
M. Dahshan¹, N. F. Polys¹, R. S. Jayne², and R. M. Pollyea²
¹ Department of Computer Science, Virginia Tech, USA
² Department of Geosciences, Virginia Tech, USA
Abstract
In the study of complex physical systems, scientists use simulations to examine the effects of different models and parameters. As
they seek to understand the influence and relationships among multiple dimensions, they typically run many simulations and vary
the initial conditions in what are known as 'ensembles'. An ensemble is thus a collection of runs, each of which is multidimensional
and multivariate. In order to understand the connections between simulation parameters and patterns in the output data, we
have been developing an approach to the visual analysis of scientific data that merges human expertise and intuition with
machine learning and statistics. Our approach is manifested in a new visualization tool, GLEE (Graphically-Linked Ensemble
Explorer), that allows scientists to explore, search, filter, and make sense of their ensembles. Our tool uses visualization and
semantic interaction techniques to enable scientists to: find similarities and differences between runs, find correlations between
different parameters, and explore relations and correlations between different runs and parameters. Our approach supports
scientists in selecting interesting subsets of runs to investigate and summarizing factors and statistics to show variations and
consistencies across different runs. In this paper, we evaluate our tool with experts to understand its strengths and weaknesses
for optimization and inverse problems.
CCS Concepts
• Scientific Visualization → Ensembles, Sensemaking;
1. Introduction
Recent advances in computing power and the availability of
high-performance computing have made it feasible to run complex
real-world simulations in an acceptable amount of time.
Scientists usually need to run their simulations multiple times using
different input conditions, simulation parameters, and simulation
models. Doing so helps the scientist interpret the variability
in the system and gain insights by alternating between models.
Through these multiple runs, they can gain a more complete under-
standing of the simulated phenomenon and model, and refine their
hypothesis and method for actual physical experiments. A set of
simulation runs is known as an ensemble: it represents a parameter
study or a set of studies using different computational models
and parameters. Scientists from a variety of disciplines, such as
aerodynamics, weather and climate forecasting, and computational fluid
dynamics, use ensembles to simulate complex systems, explore
unknowns in initial conditions, evaluate extreme cases, compare
structural characteristics of their models, and investigate parameter
sensitivity to assess the confidence in their findings. In other words,
ensembles guide the scientist in interpreting the distributions within
the data, investigating the sensitivity of outputs to certain input
parameters, and understanding the similarities and dissimilarities
between ensemble members.
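To make the notion of an ensemble as a parameter study concrete, the following is a minimal sketch in Python. It is not part of GLEE: the function `toy_simulation`, the parameter names `rate` and `amplitude`, and the grid values are all hypothetical placeholders standing in for a real simulator and its inputs. Each ensemble member pairs one parameter combination with the output it produced.

```python
import itertools
import math
import statistics


def toy_simulation(rate, amplitude, steps=50):
    """Hypothetical stand-in for a real simulator: returns a time series."""
    return [amplitude * (1.0 - math.exp(-rate * t)) for t in range(steps)]


def build_ensemble(param_grid):
    """Run the toy model once per parameter combination (full-factorial study)."""
    names = sorted(param_grid)
    ensemble = []
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        ensemble.append({"params": params, "output": toy_simulation(**params)})
    return ensemble


# Assumed grid: 3 rates x 2 amplitudes gives 6 ensemble members.
param_grid = {"rate": [0.05, 0.1, 0.2], "amplitude": [1.0, 2.0]}
ensemble = build_ensemble(param_grid)

# Summarize the variability across members using the final value of each run.
final_values = [run["output"][-1] for run in ensemble]
print("members:", len(ensemble))
print("mean of final values:", statistics.mean(final_values))
print("spread of final values:", statistics.pstdev(final_values))
```

In practice each member would hold multivariate, multidimensional output rather than a single series, but the pairing of parameters with outputs is the structure that ensemble analysis operates on.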
The analysis of ensemble data is a challenging task due to its
high dimensionality, complexity, and size. Ensemble visualization
is therefore an essential component of the analysis process: it
facilitates knowledge discovery and helps the scientist see the
characteristic features of the data through graphical
representations. Such analysis of ensembles can help scientists find
appropriate models and parameter ranges for hypothesized relationships
and outcomes. Moreover, ensemble visualization helps in measuring
the variability of the model and its sensitivity to its inputs, that
is, how output parameters react to input changes. Therefore,
the focus of this paper is the visual exploration and comparison of
the behaviors of simulations and their parameters.
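As a rough illustration of what measuring the sensitivity of outputs to inputs can mean numerically, the sketch below computes the Pearson correlation between each input parameter and a scalar output across ensemble members. This is only a simple linear proxy chosen for illustration, not the method used in this paper; the parameter names `a` and `b`, the synthetic relation `out = 2a - 0.01b`, and the helper names are assumptions.

```python
import itertools
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        return 0.0
    return cov / (dev_x * dev_y)


# Synthetic ensemble (assumed): output depends strongly on a, weakly on b.
runs = [
    {"a": a, "b": b, "out": 2.0 * a - 0.01 * b}
    for a, b in itertools.product([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
]


def sensitivities(runs, inputs, output):
    """Correlate each input parameter with the output across all runs."""
    ys = [run[output] for run in runs]
    return {name: pearson([run[name] for run in runs], ys) for name in inputs}


sens = sensitivities(runs, ["a", "b"], "out")
for name, value in sens.items():
    print(f"{name}: {value:+.3f}")
```

A correlation near +1 or -1 suggests the output responds almost linearly to that input, while a value near 0 suggests little linear influence; nonlinear or interacting effects would need other measures.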
Current research in the visual analysis of ensembles relies on
multiple techniques for showing the variability of the ensem-
ble members, major trends, and outliers. Some of these tech-
niques focus on studying the parameter space and measuring
the correlation between different parameters. Summary statistics
[PKRJ10, BPFG11, PWB*09a, MWK14, WMK13, SEG*15],
spaghetti plots [DNCP10, Det05], and probabilistic features such
as multivariate Gaussian distributions, histograms and kernel den-
sity estimates (KDE) are examples of these techniques [PPH12,
PW12]. Additionally, conventional visualization solutions such as
glyphs [HLNW11, PKRJ10, PMW13, SZD*10] and visual variables
© 2018 The Author(s)
Computer Graphics Forum © 2018 The Eurographics Association and John
Wiley & Sons Ltd. Published by John Wiley & Sons Ltd.