Journal of Mathematical Imaging and Vision
https://doi.org/10.1007/s10851-019-00902-2
On Orthogonal Projections for Dimension Reduction and Applications
in Variational Loss Function for Learning Problems
A. Breger¹ · J. I. Orlando² · P. Harar³ · M. Dörfler¹ · S. Klimscha² · C. Grechenig² · B. S. Gerendas¹,² · U. Schmidt-Erfurth² · M. Ehler¹
Received: 30 December 2018 / Accepted: 16 August 2019
© The Author(s) 2019
Abstract
The use of orthogonal projections on high-dimensional input and target data in learning frameworks is studied. First, we
investigate the relations between two standard objectives in dimension reduction, preservation of variance and of pairwise
relative distances. Investigations of their asymptotic correlation as well as numerical experiments show that a projection usually does not satisfy both objectives at once. In a standard classification problem, we determine projections on the input data
that balance the objectives and compare subsequent results. Next, we extend our application of orthogonal projections to
deep learning tasks and introduce a general framework of augmented target loss functions. These loss functions integrate
additional information via transformations and projections of the target data. In two supervised learning problems, clinical
image segmentation and music information classification, the application of our proposed augmented target loss functions
increases the accuracy.
Keywords Orthogonal projections · Dimension reduction · Preservation of data characteristics · Supervised learning · Target features
1 Introduction
Linear dimension reduction is commonly used for prepro-
cessing of high-dimensional data in complicated learning
frameworks to compress and weight important data features.
In contrast to nonlinear approaches, the use of orthogonal
projections is computationally cheap, since it corresponds
to a simple matrix multiplication. Conventional approaches
apply specific projections that preserve essential information
and complexity within a more compact representation. The
projector is usually selected by optimizing distinct objec-
tives, such as information preservation of the sample variance
or of pairwise relative distances. Widely used orthogonal
projections for dimension reduction are variants of the principal component analysis (PCA) that maximize the variance of the projected data [37]. Preservation of relative pairwise distances asks for a near-isometric embedding, and random projections guarantee such embeddings with high probability, cf. [5,15] and see also [1,6,12,27,30,35]. The use of random projections is especially favorable for large, high-dimensional data [48], since the computational complexity is just O(dkm), e.g., using the construction in [1], with d, k ∈ ℕ being the original and lower dimensions and m ∈ ℕ the number of samples. In contrast, PCA needs O(d²m) + O(d³) operations [24]. Moreover, tasks that do not have all data available at once, e.g., data streaming, ask for dimension reduction methods that are independent of the data.

✉ Corresponding author: A. Breger, anna.breger@univie.ac.at
¹ Department of Mathematics, University of Vienna, Vienna, Austria
² Department of Ophthalmology, Medical University of Vienna, Vienna, Austria
³ Department of Telecommunications, Brno University of Technology, Brno, Czech Republic
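The contrast between the two projection families can be sketched numerically. The following Python snippet is illustrative only: the synthetic data, the dimensions, and the distance-ratio spread metric are our own assumptions, not taken from this work. It compares the sample variance retained by a rank-k PCA projector with that of an orthonormal basis of a uniformly random k-dimensional subspace, and measures how uniformly each projector scales pairwise Euclidean distances.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: m samples in d dimensions, variance decaying per coordinate.
d, k, m = 50, 10, 300
X = rng.standard_normal((m, d)) * np.linspace(3.0, 0.3, d)
X = X - X.mean(axis=0)            # center the data, as PCA assumes

# PCA projector: the top-k right singular vectors of the centered data
# maximize the retained sample variance among all rank-k orthogonal projections.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
P_pca = Vt[:k]                    # (k, d), orthonormal rows

# Random projector: orthonormal basis of a uniformly random k-dimensional
# subspace (QR of a Gaussian matrix); such projections preserve relative
# distances up to a common scale with high probability.
Q, _ = np.linalg.qr(rng.standard_normal((d, k)))
P_rand = Q.T                      # (k, d), orthonormal rows

def retained_variance(P, X):
    """Fraction of total sample variance kept by the map x -> P x."""
    return float(np.sum((X @ P.T) ** 2) / np.sum(X ** 2))

def relative_distance_spread(P, X, pairs=1000):
    """Coefficient of variation of ||P(x_i - x_j)|| / ||x_i - x_j||.

    A projection preserving pairwise *relative* distances makes this ratio
    nearly constant across pairs, so a smaller spread is better."""
    i = rng.integers(0, len(X), pairs)
    j = rng.integers(0, len(X), pairs)
    mask = i != j
    D = X[i[mask]] - X[j[mask]]
    r = np.linalg.norm(D @ P.T, axis=1) / np.linalg.norm(D, axis=1)
    return float(np.std(r) / np.mean(r))

for name, P in [("PCA", P_pca), ("random", P_rand)]:
    print(f"{name:6s} variance kept: {retained_variance(P, X):.3f}  "
          f"distance-ratio spread: {relative_distance_spread(P, X):.3f}")
```

On such data the PCA projector retains at least as much variance as any other rank-k orthogonal projection by construction, while the random subspace shrinks all pairwise distances by a roughly uniform factor; which behavior is preferable depends on the downstream task.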
In the present manuscript, we study orthogonal projections
regarding the interplay between
(O1) preservation of variance,
(O2) preservation of pairwise relative distances,
aiming for a sufficient lower-dimensional data representa-
tion. We shall consider the Euclidean distance exclusively
since it is most widely used in applications, especially