Journal of Mathematical Imaging and Vision
https://doi.org/10.1007/s10851-019-00902-2

On Orthogonal Projections for Dimension Reduction and Applications in Variational Loss Function for Learning Problems

A. Breger 1 · J. I. Orlando 2 · P. Harar 3 · M. Dörfler 1 · S. Klimscha 2 · C. Grechenig 2 · B. S. Gerendas 1,2 · U. Schmidt-Erfurth 2 · M. Ehler 1

Received: 30 December 2018 / Accepted: 16 August 2019
© The Author(s) 2019

Abstract
The use of orthogonal projections on high-dimensional input and target data in learning frameworks is studied. First, we investigate the relations between two standard objectives in dimension reduction: preservation of variance and preservation of pairwise relative distances. Investigations of their asymptotic correlation, as well as numerical experiments, show that a projection usually does not satisfy both objectives at once. In a standard classification problem, we determine projections on the input data that balance the objectives and compare subsequent results. Next, we extend our application of orthogonal projections to deep learning tasks and introduce a general framework of augmented target loss functions. These loss functions integrate additional information via transformations and projections of the target data. In two supervised learning problems, clinical image segmentation and music information classification, the application of our proposed augmented target loss functions increases the accuracy.

Keywords Orthogonal projection · Dimension reduction · Preservation of data characteristics · Supervised learning · Target features

1 Introduction

Linear dimension reduction is commonly used for preprocessing high-dimensional data in complicated learning frameworks to compress and weight important data features. In contrast to nonlinear approaches, the use of orthogonal projections is computationally cheap, since it corresponds to a simple matrix multiplication.
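As a concrete illustration (not taken from the paper), reducing the dimension with an orthogonal projection amounts to one matrix multiplication. The NumPy sketch below uses an arbitrarily chosen random orthonormal basis, obtained from a QR factorization, purely as an example of such a projector; the data and dimensions are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: m = 100 samples in d = 50 dimensions.
X = rng.standard_normal((100, 50))

# An orthogonal projection onto a k-dimensional subspace can be written as
# X @ Q, where Q is a d x k matrix with orthonormal columns. Here Q comes
# from a QR factorization of a random Gaussian matrix (illustrative choice).
k = 10
Q, _ = np.linalg.qr(rng.standard_normal((50, k)))

Y = X @ Q  # reduced representation: a single matrix multiplication

print(Y.shape)                             # (100, 10)
print(np.allclose(Q.T @ Q, np.eye(k)))     # True: columns of Q are orthonormal
```

The cost of applying the projection is one m × d by d × k product, which is what makes this family of methods cheap compared with nonlinear embeddings.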
Conventional approaches apply specific projections that preserve essential information and complexity within a more compact representation. The projector is usually selected by optimizing distinct objectives, such as preservation of the sample variance or of pairwise relative distances. Widely used orthogonal projections for dimension reduction are variants of principal component analysis (PCA), which maximize the variance of the projected data [37]. Preservation of relative pairwise distances asks for a near-isometric embedding, and random projections guarantee such embeddings with high probability, cf. [5,15] and see also [1,6,12,27,30,35]. The use of random projections is especially favorable for large, high-dimensional data [48], since the computational complexity is just O(dkm), e.g., using the construction in [1], with d, k ∈ ℕ being the original and lower dimensions and m ∈ ℕ the number of samples. In contrast, PCA needs O(d²m) + O(d³) operations [24]. Moreover, tasks that do not have all data available at once, e.g., data streaming, ask for dimension reduction methods that are independent of the data.

In the present manuscript, we study orthogonal projections regarding the interplay between

(O1) preservation of variance,
(O2) preservation of pairwise relative distances,

aiming for a sufficient lower-dimensional data representation. We shall consider the Euclidean distance exclusively since it is most widely used in applications, especially

Corresponding author: A. Breger, anna.breger@univie.ac.at
1 Department of Mathematics, University of Vienna, Vienna, Austria
2 Department of Ophthalmology, Medical University of Vienna, Vienna, Austria
3 Department of Telecommunications, Brno University of Technology, Brno, Czech Republic
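The two objectives can be made concrete in a small NumPy sketch (illustrative only; the data, dimensions, and the Gaussian construction below are assumptions, not the paper's experimental setup). PCA keeps the directions of largest variance, addressing (O1), while a scaled Gaussian random projection in the spirit of the Johnson–Lindenstrauss lemma approximately preserves pairwise Euclidean distances, addressing (O2), and is independent of the data.

```python
import numpy as np

rng = np.random.default_rng(1)
m, d, k = 200, 100, 20

# Anisotropic toy data: coordinates have strongly decaying scales.
X = rng.standard_normal((m, d)) * np.linspace(5.0, 0.1, d)
Xc = X - X.mean(axis=0)  # center before PCA

# (O1) PCA projector: the top-k right singular vectors maximize the
# variance retained by a k-dimensional orthogonal projection.
_, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Y_pca = Xc @ Vt[:k].T
var_ratio = (S[:k] ** 2).sum() / (S ** 2).sum()

# (O2) Scaled Gaussian random projection: near-isometric with high
# probability, data-independent, and applied in O(dkm) operations.
R = rng.standard_normal((d, k)) / np.sqrt(k)
Y_rp = Xc @ R

def max_distance_distortion(A, B, n_pairs=500):
    """Largest relative change of pairwise Euclidean distances on a sample of pairs."""
    i = rng.integers(0, m, n_pairs)
    j = rng.integers(0, m, n_pairs)
    keep = i != j
    da = np.linalg.norm(A[i[keep]] - A[j[keep]], axis=1)
    db = np.linalg.norm(B[i[keep]] - B[j[keep]], axis=1)
    return np.abs(db / da - 1).max()

print(f"PCA retained variance ratio: {var_ratio:.3f}")
print(f"max relative distance distortion (random projection): "
      f"{max_distance_distortion(Xc, Y_rp):.3f}")
```

On such anisotropic data, the PCA projection retains far more than the proportional share k/d of the variance, whereas the random projection keeps sampled pairwise distances within a moderate relative distortion; neither projector is designed to do both jobs at once, which is the tension examined in the following sections.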