IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 40, NO. 5, OCTOBER 2010

Robust Classifiers for Data Reduced via Random Projections

Angshul Majumdar and Rabab K. Ward

Abstract—The computational cost of most classification algorithms depends on the dimensionality of the input samples. As the dimensionality can be high in many cases, particularly those associated with image classification, reducing the dimensionality of the data becomes a necessity. Traditional dimensionality reduction methods are data dependent, which poses certain practical problems. Random projection (RP) is an alternative dimensionality reduction method that is data independent and bypasses these problems. The nearest neighbor classifier has been used with the RP method in classification problems. To obtain higher recognition accuracy, this study examines the robustness to RP dimensionality reduction of several recently proposed classifiers: the sparse classifier (SC), the group SC (along with their fast versions), and the nearest subspace classifier. Theoretical proofs are offered regarding the robustness of these classifiers to RP. The theoretical results are confirmed by experimental evaluations.

Index Terms—Classification, face recognition, random projection (RP).

I. INTRODUCTION

The term "compressive classification" (CC) was first coined in [1]. It originated with a new paradigm in signal processing called "compressive sampling" or "compressed sensing" (CS) [2], [3]. CS combines dimensionality reduction with data acquisition by collecting a (random) lower dimensional projection of the original data instead of sampling the data directly. CC refers to a new class of classification methods that are robust to data acquired using CS.
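The premise behind CC, that a random lower dimensional projection approximately preserves the geometry of the data, can be illustrated with a short numerical sketch (a minimal illustration with synthetic data and arbitrary dimensions, not the setup used in this paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 50, 4096, 256            # samples, ambient dim, reduced dim

# Synthetic stand-ins for vectorized images
X = rng.standard_normal((n, d))

# Gaussian RP matrix, scaled so that E[||Ax||^2] = ||x||^2;
# acquiring Y amounts to CS-style acquisition of m-dimensional projections
A = rng.standard_normal((m, d)) / np.sqrt(m)
Y = X @ A.T

def pdists(M):
    """Condensed vector of pairwise Euclidean distances."""
    g = np.sum(M * M, axis=1)
    D2 = g[:, None] + g[None, :] - 2.0 * (M @ M.T)
    iu = np.triu_indices(len(M), k=1)
    return np.sqrt(np.maximum(D2[iu], 0.0))

# Each ratio is close to 1: pairwise distances survive the projection
ratios = pdists(Y) / pdists(X)
```

By the Johnson–Lindenstrauss lemma, all pairwise distances among n points are preserved within a factor of 1 ± ε with high probability once m grows on the order of ε⁻² log n, which is what makes distance-based classifiers viable on such data.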
Manuscript received May 26, 2009; revised November 30, 2009; accepted December 7, 2009. Date of publication January 26, 2010; date of current version September 15, 2010. This paper was recommended by Associate Editor F. Karray. The authors are with the Department of Electrical and Computer Engineering, The University of British Columbia, Vancouver, BC V6T 1Z4, Canada (e-mail: angshulm@ece.ubc.ca; rababw@ece.ubc.ca). Digital Object Identifier 10.1109/TSMCB.2009.2038493

Only a few properties are preserved by CS data acquisition, and compressive classifiers are designed to exploit these properties so that the recognition accuracy on data acquired by CS is approximately the same as that on data acquired by traditional sampling. In this work, we discuss a group of such classifiers that are robust to data acquired in this manner and therefore fall under the category of CC.

There is a basic difference that separates CC from conventional classification methods. In conventional classification, the data are acquired by traditional (Nyquist) sampling. Once all the data are obtained, a data-dependent dimensionality reduction technique is employed; data acquisition and dimensionality reduction are disjoint activities. CC operates on data acquired by a CS technique, where dimensionality reduction occurs simultaneously with data acquisition. Thus, CC works with a dimensionality reduction method that is data independent, whereas the dimensionality reduction techniques in traditional classification are data dependent (e.g., principal component analysis, linear discriminant analysis, etc.).

For some practical situations, data-dependent dimensionality reduction methods are not efficient. Consider a practical scenario of face authentication in a bank or an office. In a bank, new clients are added daily to the database, and in some offices, employees are also added on a regular basis.
Suppose that, at a certain time, face images of 200 people are available and, following conventional face recognition methods (e.g., eigenface and Fisherface), a data-dependent dimensionality reduction is employed, resulting in a high- to low-dimensional projection matrix. When images of ten more people are added (e.g., the next day), the projection matrix must be recalculated for all 210 people. Unfortunately, there is no way for the old projection matrix to be updated with the new data (reducing the complexity of such updates is an active area of research [15], [16]). For such cases, a data-independent dimensionality reduction method is desirable.

Such a scenario can easily be handled by CC. CC uses a random projection (RP) matrix for dimensionality reduction. The projection matrix is data independent (it can be a Gaussian- or a Bernoulli-type random matrix or a partial Fourier matrix). Compressive classifiers are data independent in the sense that, unlike support vector machines (SVMs) or artificial neural networks (ANNs), they do not require retraining whenever new data are added.

Dimensionality reduction by RP (i.e., CS data acquisition) [41] gives good results only if the classifier is based on a distance measure (e.g., Euclidean or cosine). Consequently, the nearest neighbor (NN) classifier is robust to such randomly projected data and can be used as a compressive classifier. Other studies have shown empirically that RP can also be used in conjunction with certain ANNs [4] and SVMs [5]. However, both ANNs and SVMs have a data-dependent training phase, i.e., they need to be retrained whenever new data are added. As a result, ANNs and SVMs are not computationally efficient solutions to the aforementioned problem; hence, we will not consider these classifiers in this work. The idea behind CC is to provide data-independent solutions for dimensionality reduction and classification problems.
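The data independence described above can be sketched concretely: the RP matrix is drawn once (Gaussian or Bernoulli), enrolling a new client requires only projecting that client's image, and NN classification then operates on Euclidean distances in the reduced space. The sketch below uses synthetic stand-in data; all dimensions and names are illustrative, not this paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
d, m = 1024, 64                    # ambient and reduced dimensions

# Data-independent projections: drawn once, never retrained
A_gauss = rng.standard_normal((m, d)) / np.sqrt(m)           # Gaussian RP
A_bern = rng.choice([-1.0, 1.0], size=(m, d)) / np.sqrt(m)   # Bernoulli RP

def nn_classify(probe, gallery, labels):
    """NN classification by Euclidean distance in the projected domain."""
    return labels[np.argmin(np.linalg.norm(gallery - probe, axis=1))]

# Gallery of 5 enrolled "clients" (well-separated synthetic faces)
centers = 3.0 * rng.standard_normal((5, d))
gallery = np.vstack([c + 0.1 * rng.standard_normal(d) for c in centers]) @ A_gauss.T
labels = np.arange(5)

# Enrolling client 5 touches neither A_gauss nor the existing gallery rows
new_face = 3.0 * rng.standard_normal(d)
gallery = np.vstack([gallery, (new_face + 0.1 * rng.standard_normal(d)) @ A_gauss.T])
labels = np.append(labels, 5)

# A fresh noisy image of the new client is matched in the reduced space
pred = nn_classify((new_face + 0.1 * rng.standard_normal(d)) @ A_gauss.T,
                   gallery, labels)
```

By contrast, a PCA or LDA projection learned from the first five clients would have to be recomputed once the sixth is enrolled, which is precisely the update cost that CC avoids.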
Traditionally, it is assumed that the training phase is offline, so the constraint on the time/computation spent during training is weak. In this case, the existing sophisticated methods for dimensionality reduction [6]–[12] and classification [13], [14] can be employed. It should be mentioned that some effort toward online training is discernible in current face recognition research [17]. It is likewise traditionally assumed that the training samples are fixed, i.e., do not change with time. However, practical scenarios dictate updating