IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 18, NO. 12, DECEMBER 2008 1753
Transactions Letters
Convergent 2-D Subspace Learning With Null Space Analysis
Dong Xu, Shuicheng Yan, Member, IEEE, Stephen Lin, and Thomas S. Huang, Life Fellow, IEEE
Abstract—Recent research has demonstrated the success of
supervised dimensionality reduction algorithms 2DLDA and
2DMFA, which are based on the image-as-matrix representation,
in small sample size cases. To solve the convergence problem in
2DLDA and 2DMFA, we propose in this work two new schemes,
called Null Space based 2DLDA (NS2DLDA) and Null Space
based 2DMFA (NS2DMFA), and apply them to the challenging
multi-view face recognition task. First, we convert each 2-D face
image (matrix) into a vector and compute the first projection
matrix from the null space of the intra-class scatter matrix,
such that the samples from the same class are projected to the
same point. Then the data are projected and reconstructed with
the first projection matrix. Finally, we re-organize each reconstructed
datum into a matrix and then compute the second projection matrix,
in the form of a Kronecker product of two matrices, by maximizing the
inter-class scatter. A proof of algorithmic convergence is provided.
The experiments on two benchmark multi-view face databases,
the CMU PIE and FERET databases, demonstrate that NS2DLDA
outperforms Fisherface, Null Space LDA (NSLDA) and 2DLDA.
Additionally, NS2DMFA is shown to be more accurate
than MFA and 2DMFA for face recognition.
Index Terms—LDA, MFA, multiview face recognition, null space
LDA, 2DLDA, 2DMFA.
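The first two steps summarized in the abstract can be illustrated with a short hedged sketch. This is not the authors' implementation; the function name, shapes, and the eigenvalue tolerance are illustrative, and the Kronecker-structured second projection is omitted.

```python
import numpy as np

def ns_first_stage(images, y, tol=1e-10):
    """Illustrative sketch of the first stage: compute a projection from
    the null space of the intra-class scatter matrix Sw, so that samples
    of the same class project to the same point, then reconstruct and
    re-organize each reconstructed vector into an image-shaped matrix."""
    n, h, w = images.shape
    X = images.reshape(n, h * w)          # image-as-vector
    Sw = np.zeros((h * w, h * w))
    for c in np.unique(y):
        D = X[y == c] - X[y == c].mean(axis=0)
        Sw += D.T @ D                     # intra-class scatter
    evals, V = np.linalg.eigh(Sw)
    W1 = V[:, evals < tol]                # basis of the null space of Sw
    X_rec = X @ W1 @ W1.T                 # project, then reconstruct
    return X_rec.reshape(n, h, w)         # back to image-as-matrix form
```

Within the null space of the intra-class scatter, each sample coincides with the projection of its class mean, so the reconstructed matrices are identical within each class.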
I. INTRODUCTION
SUPERVISED dimensionality reduction algorithms
have achieved great success in face recognition. Ac-
cording to the image representation, these algorithms can be
roughly classified into two categories: image-as-vector and
image-as-matrix. The algorithms based on the image-as-vector
representation first convert a 2-D image into a 1-D vector
before dimensionality reduction. A typical example is Linear
Discriminant Analysis (LDA) [5], which seeks the projection
directions that maximize inter-class scatter and at the same
time minimize intra-class scatter. In face recognition, the
Manuscript received July 23, 2007; revised November 08, 2007. First pub-
lished September 23, 2008; current version published November 26, 2008. This
work was supported by the Singapore National Research Foundation Interactive
Digital Media R&D Program, under research Grant NRF2008IDM-IDM-004-
018. This paper was recommended by Associate Editor D. Schonfeld.
D. Xu is with the School of Computer Engineering, Nanyang Technological
University, 639798 Singapore (e-mail: dongxu@ntu.edu.sg).
S. Yan is with the Department of Electrical and Computer Engineering, Na-
tional University of Singapore, 117576 Singapore.
S. Lin is with Microsoft Research Asia, 100080 Beijing, China.
T. S. Huang is with the Beckman Institute, University of Illinois at Urbana-
Champaign, Urbana, IL 61801 USA.
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TCSVT.2008.2005799
intra-class scatter matrix is often singular because of the
small sample size compared with the large feature size. Several
variants of LDA have been proposed to avoid this problem.
Belhumeur et al. [1] proposed the PCA+LDA strategy, namely
Fisherface, in which PCA [16] is used to avoid the singularity
problem for the intra-class scatter matrix. However, Fisherface uses only
information from the principal subspace, and ignores the null subspace
of this matrix. Observing that the null subspace may also contain
discriminative information, Chen et al. [3] chose projection
directions by maximizing inter-class scatter under the constraint
that the projection directions lie within the null space of the
intra-class scatter matrix.
We refer to this algorithm as Null Space LDA (NSLDA) in
this work. Direct LDA [24] and dual-space LDA [17] were
also proposed to take advantage of the null space information
of the intra-class scatter matrix. However, the aforementioned
algorithms suffer from insufficient learnability in small sample size
cases, a common situation in real applications such as face recognition,
where training sets are typically small relative to the feature
dimension. Consequently, the subspaces learnt by these algorithms may
generalize poorly to test data.
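The NSLDA criterion described above, maximizing inter-class scatter subject to the projections lying in the null space of the intra-class scatter, admits a compact sketch. This is a rough illustration under our own naming and tolerance choices, not the exact procedure of [3].

```python
import numpy as np

def nslda(X, y, n_dims):
    """Hedged sketch of Null Space LDA: restrict to the null space of
    the intra-class scatter Sw, then maximize inter-class scatter Sb
    within that subspace.  X: (n_samples, n_features); y: labels."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        D = Xc - mc
        Sw += D.T @ D                        # intra-class scatter
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)  # inter-class
    evals, V = np.linalg.eigh(Sw)
    N = V[:, evals < 1e-10]                  # null-space basis of Sw
    # Maximize inter-class scatter inside the null space.
    wb, Vb = np.linalg.eigh(N.T @ Sb @ N)
    return N @ Vb[:, ::-1][:, :n_dims]       # top directions, mapped back
```

In the small-sample regime the null space of Sw is high-dimensional, which is exactly the setting where this construction applies.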
In the real world, there exist data, such as image objects, that
are intrinsically in the form of second- or higher-order tensors.
For example, gray-level images are second-order tensor data
(matrices), and can be expanded to a third-order tensor by collecting
the set of images produced by Gabor filtering [20]. It is often
helpful to process the data in its original form and order [4],
[10]–[15], [20]–[23]. Recently, Ye et al. [23] and Yan et al. [21]
proposed 2DLDA and 2-D Marginal Fisher Analysis (2DMFA)^1
to directly conduct supervised dimensionality reduction with
image objects represented as matrices, an image-as-matrix rep-
resentation.
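Both families of methods learn a pair of projection matrices acting on the rows and columns of each image. The following is a minimal sketch of that bilinear mapping only; the variable names and shapes are illustrative and it does not reproduce the iterative updates of [21], [23].

```python
import numpy as np

def bilinear_project(images, L, R):
    """Image-as-matrix projection: map each h-by-w image X to the
    smaller r1-by-r2 matrix L.T @ X @ R, keeping the row/column
    structure intact.  images: (n, h, w); L: (h, r1); R: (w, r2)."""
    return np.stack([L.T @ X @ R for X in images])
```

Note the parameter count: the pair (L, R) has only h*r1 + w*r2 entries, versus h*w*r1*r2 for an unconstrained linear map on vectorized images, which is one way to see the improved learnability of the image-as-matrix representation.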
The methods in [21], [23] apply Rank-(R1, R2) decom-
position in pursuing the solution. The main advantage of the
image-as-matrix representation is in enhancing learnability. To
utilize intrinsic structure information, such as the correlations
among rows and columns, these methods deal with each datum directly
in its intrinsic form rather than aggregating all the information
into a single vector. This involves computing several subspaces
of reduced feature dimension, which enhances
algorithmic learnability. Experiments demonstrate that when
the training set is small, 2DLDA and 2DMFA usually outperform
Fisherface and Marginal Fisher Analysis (MFA), which are based on the
image-as-vector representation. However, the main drawback of 2DLDA and 2DMFA is
^1 The work in [21] is often referred to as TMFA because it can deal
with arbitrary order tensors as inputs. We focus on gray-level images as inputs
in this work, so we refer to it as 2DMFA. A similar work was simultaneously
presented in [4].
1051-8215/$25.00 © 2008 IEEE