IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 18, NO. 12, DECEMBER 2008 1753
Transactions Letters
Convergent 2-D Subspace Learning With Null Space Analysis
Dong Xu, Shuicheng Yan, Member, IEEE, Stephen Lin, and Thomas S. Huang, Life Fellow, IEEE
Abstract—Recent research has demonstrated the success of
supervised dimensionality reduction algorithms 2DLDA and
2DMFA, which are based on the image-as-matrix representation,
in small sample size cases. To solve the convergence problem in
2DLDA and 2DMFA, we propose in this work two new schemes,
called Null Space based 2DLDA (NS2DLDA) and Null Space
based 2DMFA (NS2DMFA), and apply them to the challenging
multi-view face recognition task. First, we convert each 2-D face
image (matrix) into a vector and compute the first projection
matrix from the null space of the intra-class scatter matrix,
such that the samples from the same class are projected to the
same point. Then the data are projected and reconstructed with
the first projection matrix. Finally, we re-organize each reconstructed
datum into a matrix and then compute the second projection matrix,
in the form of a Kronecker product of two matrices, by maximizing the
inter-class scatter. A proof of algorithmic convergence is provided.
The experiments on two benchmark multi-view face databases,
the CMU PIE and FERET databases, demonstrate that NS2DLDA
outperforms Fisherface, Null Space LDA (NSLDA) and 2DLDA.
Additionally, NS2DMFA is shown to be more accurate
than MFA and 2DMFA for face recognition.
Index Terms—LDA, MFA, multiview face recognition, null space
LDA, 2DLDA, 2DMFA.
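The first two steps summarized in the abstract can be illustrated with a short hedged sketch. This is not the authors' implementation; the function name, shapes, and the eigenvalue tolerance are illustrative, and the Kronecker-structured second projection is omitted.

```python
import numpy as np

def ns_first_stage(images, y, tol=1e-10):
    """Illustrative sketch of the first stage: compute a projection from
    the null space of the intra-class scatter matrix Sw, so that samples
    of the same class project to the same point, then reconstruct and
    re-organize each reconstructed vector into an image-shaped matrix."""
    n, h, w = images.shape
    X = images.reshape(n, h * w)          # image-as-vector
    Sw = np.zeros((h * w, h * w))
    for c in np.unique(y):
        D = X[y == c] - X[y == c].mean(axis=0)
        Sw += D.T @ D                     # intra-class scatter
    evals, V = np.linalg.eigh(Sw)
    W1 = V[:, evals < tol]                # basis of the null space of Sw
    X_rec = X @ W1 @ W1.T                 # project, then reconstruct
    return X_rec.reshape(n, h, w)         # back to image-as-matrix form
```

Within the null space of the intra-class scatter, each sample coincides with the projection of its class mean, so the reconstructed matrices are identical within each class.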
I. INTRODUCTION
SUPERVISED dimensionality reduction algorithms
have achieved great success in face recognition. Ac-
cording to the image representation, these algorithms can be
roughly classified into two categories: image-as-vector and
image-as-matrix. The algorithms based on the image-as-vector
representation first convert a 2-D image into a 1-D vector
before dimensionality reduction. A typical example is Linear
Discriminant Analysis (LDA) [5], which seeks the projection
directions that maximize inter-class scatter and at the same
time minimize intra-class scatter. In face recognition, the
Manuscript received July 23, 2007; revised November 08, 2007. First pub-
lished September 23, 2008; current version published November 26, 2008. This
work was supported by the Singapore National Research Foundation Interactive
Digital Media R&D Program, under research Grant NRF2008IDM-IDM-004-
018. This paper was recommended by Associate Editor D. Schonfeld.
D. Xu is with the School of Computer Engineering, Nanyang Technological
University, 639798 Singapore (e-mail: dongxu@ntu.edu.sg).
S. Yan is with the Department of Electrical and Computer Engineering, Na-
tional University of Singapore, 117576 Singapore.
S. Lin is with Microsoft Research Asia, 100080 Beijing, China.
T. S. Huang is with the Beckman Institute, University of Illinois at Urbana-
Champaign, Urbana, IL 61801 USA.
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TCSVT.2008.2005799
intra-class scatter matrix is often singular because of the
small sample size compared with the large feature size. Several
variants of LDA have been proposed to avoid this problem.
Belhumeur et al. [1] proposed the PCA+LDA strategy, namely
Fisherface, in which PCA [16] is used to avoid the singularity
problem for the intra-class scatter matrix. However, Fisherface uses only
information from the principal subspace, and ignores the null subspace
of this matrix. Observing that the null subspace may also contain
discriminative information, Chen et al. [3] chose projection
directions by maximizing inter-class scatter under the constraint
that the projection directions lie within the null space of the
intra-class scatter matrix.
We refer to this algorithm as Null Space LDA (NSLDA) in
this work. Direct LDA [24] and dual-space LDA [17] were
also proposed to take advantage of the null space information
of the intra-class scatter matrix. However, the aforementioned
algorithms suffer from insufficient learnability in small sample size
cases, a common situation in real applications such as face recognition,
where training sets are typically small relative to the feature
dimension. Consequently, the subspaces learnt by these algorithms may
generalize poorly to test data.
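The NSLDA criterion described above, maximizing inter-class scatter subject to the projections lying in the null space of the intra-class scatter, admits a compact sketch. This is a rough illustration under our own naming and tolerance choices, not the exact procedure of [3].

```python
import numpy as np

def nslda(X, y, n_dims):
    """Hedged sketch of Null Space LDA: restrict to the null space of
    the intra-class scatter Sw, then maximize inter-class scatter Sb
    within that subspace.  X: (n_samples, n_features); y: labels."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        D = Xc - mc
        Sw += D.T @ D                        # intra-class scatter
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)  # inter-class
    evals, V = np.linalg.eigh(Sw)
    N = V[:, evals < 1e-10]                  # null-space basis of Sw
    # Maximize inter-class scatter inside the null space.
    wb, Vb = np.linalg.eigh(N.T @ Sb @ N)
    return N @ Vb[:, ::-1][:, :n_dims]       # top directions, mapped back
```

In the small-sample regime the null space of Sw is high-dimensional, which is exactly the setting where this construction applies.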
In the real world, there exist data, such as image objects, that
are intrinsically in the form of second- or higher-order tensors.
For example, gray-level images are second-order tensor data
(matrices), and can be expanded to a third-order tensor by collecting
the set of images produced by Gabor filtering [20]. It is often
helpful to process the data in its original form and order [4],
[10]–[15], [20]–[23]. Recently, Ye et al. [23] and Yan et al. [21]
proposed 2DLDA and 2-D Marginal Fisher Analysis (2DMFA)^1
to directly conduct supervised dimensionality reduction with
image objects represented as matrices, an image-as-matrix rep-
resentation.
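Both families of methods learn a pair of projection matrices acting on the rows and columns of each image. The following is a minimal sketch of that bilinear mapping only; the variable names and shapes are illustrative and it does not reproduce the iterative updates of [21], [23].

```python
import numpy as np

def bilinear_project(images, L, R):
    """Image-as-matrix projection: map each h-by-w image X to the
    smaller r1-by-r2 matrix L.T @ X @ R, keeping the row/column
    structure intact.  images: (n, h, w); L: (h, r1); R: (w, r2)."""
    return np.stack([L.T @ X @ R for X in images])
```

Note the parameter count: the pair (L, R) has only h*r1 + w*r2 entries, versus h*w*r1*r2 for an unconstrained linear map on vectorized images, which is one way to see the improved learnability of the image-as-matrix representation.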
The methods in [21], [23] apply Rank-(R1, R2) decom-
position in pursuing the solution. The main advantage of the
image-as-matrix representation is in enhancing learnability. To
utilize intrinsic structure information, such as the correlations
among rows and columns, these methods deal with each datum directly
in its intrinsic form rather than aggregating all the information
into a single vector. This involves computing several subspaces
of reduced feature dimension, which enhances
algorithmic learnability. Experiments demonstrate that when
the training set is small, 2DLDA and 2DMFA usually outperform
Fisherface and Marginal Fisher Analysis (MFA), which are based on the
image-as-vector representation. However, the main drawback of 2DLDA and 2DMFA is
^1 The work in [21] is often referred to as TMFA because it can deal
with arbitrary order tensors as inputs. We focus on gray-level images as inputs
in this work, so we refer to it as 2DMFA. A similar work was simultaneously
presented in [4].
1051-8215/$25.00 © 2008 IEEE