The Visual Computer
https://doi.org/10.1007/s00371-020-01817-5
ORIGINAL ARTICLE
A review, framework, and R toolkit for exploring, evaluating, and
comparing visualization methods
Stephen L. France¹ · Ulas Akkucuk²
© Springer-Verlag GmbH Germany, part of Springer Nature 2020
Abstract
This paper gives a review and synthesis of methods of evaluating dimensionality reduction techniques. Particular attention is
paid to rank-order neighborhood evaluation metrics. A framework is created for exploring dimensionality reduction quality
through visualization. An associated toolkit is implemented in R. The toolkit includes scatterplots, heat maps, loess smoothing,
performance lift diagrams, and animation. The overall rationale is to help researchers compare dimensionality reduction
techniques and use visual insights to help select and improve techniques. Examples are given for dimensionality reduction in
manifolds and for dimensionality reduction applied to fashion image and consumer survey datasets.
Keywords Dimensionality reduction · Mapping · Solution quality · Model selection
1 Introduction
The problem of dimensionality reduction is core to statistics,
machine learning, and visualization. High-dimensional data
can contain a large amount of noise and, importantly for
visualization, the human brain can only comprehend a limited
number of dimensions. Thus, there is a need to reduce data
into an interpretable format by converting high-dimensional
data into a lower number of dimensions, which can
subsequently be visualized using lower-dimensional plots. To
meet the need for dimensionality reduction methods, a
plethora of algorithms and associated fitting methods have
been developed. A researcher wishing to perform
dimensionality reduction for visualization will be presented with a
choice of hundreds of algorithms. Which algorithm should be
used? This paper describes a visualization framework called
QVisVis and associated software tools implemented in R to
help choose dimensionality reduction methods, tune these
methods, and visually evaluate the quality of dimensionality
reduction solutions. The major contributions of this paper are
to review and synthesize the previous work on visualization
performance metrics, create an overall visualization
framework for “visualizing” visualization quality, and implement
the framework in an R toolkit.

✉ Stephen L. France
sfrance@business.msstate.edu
Ulas Akkucuk
ulas.akkucuk@boun.edu.tr
¹ Mississippi State University, Mississippi State, MS 39762, USA
² Department of Management, Bogazici University, 34342 Bebek, Istanbul, Turkey
1.1 Visualization design and evaluation
The roots of much of modern data-based visualization come
from exploratory data analysis, which was popularized by
John Tukey [108], who developed an array of simple tools,
such as the box plot, to help summarize, explore, and
ultimately gain insight from data. This idea of “exploration”
is still core to modern visualization. Visualization
exploration [47] can be thought of as a process where a user tunes
parameters to transform and explore data. At each stage of
the process, parameters are passed to a visualization
transform [112] function, which creates the visualization. The
user then uses the visualization to further tune the parameters,
as part of a feedback loop. When implementing data visualization
systems, both artistic [119] and data-based engineering
considerations come into play. An overarching consideration, which
subsumes both artistic and engineering aspects, is that of
design [5]. This design-based view can be combined with the
previously described process-based view in a design activity
framework [70]. Here, the researcher will try to understand
the problem along with its opportunities and constraints and