Biologically Inspired Lighting Invariant Facial
Identity Recognition
Bharath Ramesh, Liushu Huang, Chao Tian, Cheng Xiang, and Tong Heng Lee
Department of Electrical and Computer Engineering
National University of Singapore, Singapore 117576
Email: elexc@nus.edu.sg (Cheng Xiang)
Abstract—Over the past decade, a considerable amount of
literature has been published on face recognition. Since recogni-
tion of frontal face images under controlled settings has become
easy to achieve, a number of recent studies have emphasized the
importance of robustness to variations in pose and illumination.
In this paper, we undertake the task of recognizing face images
taken under drastic lighting variations using a hierarchical facial
identity recognition framework inspired by the human vision
system. The proposed system employs a novel log-polar encoding of Gabor-filtered outputs in order to extract scale- and rotation-invariant edge information from face images. Besides the novel
encoding strategy, an explicit pre-processing step is proposed to
deal with drastic lighting changes. We tested the facial identity
recognition framework on two popular face databases, Yale and ORL, and obtained recognition rates on par with recent work. Besides these standard databases, we tested on
the Yale B database, which was specifically designed to test
illumination invariance. In summary, we have outperformed state-
of-the-art methods on the challenging Yale B face database using
the proposed facial identity recognition framework.
I. INTRODUCTION
In general, face recognition systems can be categorized
into two: holistic matching and local matching. Holistic or
global matching treats the whole face region as a single entity
for classification (e.g. [1], [2]). On the other hand, local or
component matching divides the face region into sub-regions
and combines the local statistics for classifying the face image
[3]. Of the two, local matching is less sensitive to pose changes and occlusion [4], and is therefore adopted in this work. Within the local matching strategy, the local patches can be encoded in various ways: some works use gray levels alone, others filtered responses [5], and still others encodings of filtered responses [6]. The encoding strategy
serves two purposes: (1) enforce invariant properties, and (2)
downsample the filtered patches to decrease computational
time. In this paper, we address both these goals using a log-
polar raster sampling on Gabor filtered outputs, followed by
spectral analysis using Fourier transform.
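To make the filtering stage concrete, the sketch below builds a real-valued Gabor kernel and applies it at several orientations. This is an illustrative NumPy/SciPy sketch only, not the implementation used in this paper; the function names and all parameter values (kernel size, wavelength, aspect ratio) are assumptions chosen for demonstration.

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(ksize=15, sigma=3.0, theta=0.0, lam=6.0, gamma=0.5, psi=0.0):
    """Real Gabor kernel: a Gaussian envelope times an oriented sinusoid.
    theta sets the orientation, lam the wavelength, gamma the aspect ratio."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)      # rotated coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * xr / lam + psi)

def gabor_responses(patch, n_orientations=4, **kwargs):
    """Filter a patch with Gabor kernels at evenly spaced orientations."""
    thetas = np.arange(n_orientations) * np.pi / n_orientations
    return [fftconvolve(patch, gabor_kernel(theta=t, **kwargs), mode='same')
            for t in thetas]
```

With the odd-phase variant (psi = π/2), the orientation-matched filter responds strongly to a step edge while the orthogonal one is nearly silent, which is the orientation selectivity the filtered outputs rely on.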
The log-polar transform (LPT) [7] simulates the area of
foveal vision in the human vision system, and thereby achieves
a conformal mapping equivalent to a single instance of a visual
fixation. In the computer vision literature, the most successful
application of log-polar mapping is in the highly regarded
shape descriptor, the shape context [8]. Note, however, that the shape context builds log-polar histograms instead of using the original LPT, which samples the image exponentially at the intersections of the rings and wedges of the transform.
Due to this exponential polar sampling, scale and rotation
changes in the Cartesian image correspond to horizontal and
vertical shifts in the log-polar domain, respectively [7]. Consequently, the Fourier transform modulus can be used to enforce translation invariance, so that two log-polar images of similar facial parts will have “similar” Fourier transform magnitudes.
Nevertheless, a similar representation does not eliminate noise
or guarantee discriminative properties.
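The sampling and invariance argument can be sketched as follows. This is a minimal nearest-neighbour illustration, not the paper's implementation; the function name and the ring/wedge counts are assumptions. A circular shift along the wedge axis (the log-polar image of a rotated input) leaves the Fourier magnitude unchanged.

```python
import numpy as np

def log_polar_transform(img, n_rings=32, n_wedges=32):
    """Sample img (nearest neighbour) at the intersections of exponentially
    spaced rings and evenly spaced wedges, centred on the image centre."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r_max = min(cy, cx)
    radii = np.exp(np.linspace(0.0, np.log(r_max), n_rings))   # exponential rings
    thetas = np.linspace(0.0, 2.0 * np.pi, n_wedges, endpoint=False)
    out = np.zeros((n_rings, n_wedges))
    for i, r in enumerate(radii):
        for j, t in enumerate(thetas):
            y = int(round(cy + r * np.sin(t)))
            x = int(round(cx + r * np.cos(t)))
            out[i, j] = img[y, x]
    return out
```

Because a circular shift of a signal only multiplies its DFT by a phase factor, the magnitude spectrum of the log-polar image is unaffected by such shifts, which is the property exploited here.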
Feature extraction is meant to improve classifier performance by discarding irrelevant information, such as noise and redundancy, from the set of input features [9]. While noise is readily regarded as a hindrance to optimal classification, it has also been observed that when the number of training samples is far smaller than the feature dimension (the small sample size problem), the ‘curse of dimensionality’ degrades the performance
of the classifier [10]. From the earliest methods like principal
component analysis (PCA) [11], Fisher’s linear discriminant
(FLD) [12] to recent variants of recursive Fisher’s linear
discriminant (RFLD) [13], the goal of feature extraction is
to project the input features to a discriminant low-dimensional
subspace. In this paper, we propose the combined usage of
PCA, FLD and RFLD to improve classifier accuracy by reduc-
ing the LPT feature dimension and eliminating noisy features.
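As an illustrative sketch of this two-stage idea (PCA for dimensionality reduction, followed by a Fisher discriminant), consider the snippet below. It is not the authors' RFLD pipeline: the function names, the two-class restriction, and the ridge term added for invertibility are all assumptions made for demonstration.

```python
import numpy as np

def pca_project(X, n_components):
    """SVD-based PCA: project the rows of X onto the top n_components
    principal directions; also return the basis and the data mean."""
    mean = X.mean(axis=0)
    Xc = X - mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components].T                      # (d, n_components) basis
    return Xc @ W, W, mean

def fld_direction(X, y):
    """Two-class Fisher's linear discriminant: w is proportional to
    Sw^{-1}(m1 - m0), with a small ridge so the scatter is invertible."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.solve(Sw + 1e-6 * np.eye(Sw.shape[0]), m1 - m0)
    return w / np.linalg.norm(w)
```

Running PCA first keeps the within-class scatter matrix small and well conditioned, which is precisely the small-sample-size concern raised above.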
Thus, the feature extraction module is crucial for obtaining a
discriminative representation of similar face images. The other
important component in the framework is the pre-processing
needed for dealing with drastic lighting variations.
To tackle lighting problems in face recognition, three general approaches have been attempted: illumination normalization, illumination-invariant feature extraction, and face modeling [14]. The first approach aims to mitigate lighting
variations by applying global image processing techniques
like histogram equalization, logarithmic transform [15], etc.
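For illustration, these two normalization techniques can be sketched as follows for an 8-bit grayscale image. This is a minimal NumPy sketch, not the exact pre-processing used in this paper; it assumes the input image is not constant.

```python
import numpy as np

def hist_equalize(img):
    """Global histogram equalization of an 8-bit grayscale image via a
    lookup table built from the cumulative histogram."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
    return lut[img]

def log_transform(img):
    """Logarithmic transform: compress the dynamic range, then rescale
    back to the full 8-bit range."""
    f = np.log1p(img.astype(np.float64))
    return np.round(f / f.max() * 255).astype(np.uint8)
```

Both operate on the global gray-level distribution only, which is why such normalization can fall short under the drastic, spatially varying lighting considered later.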
The second approach aims to extract facial features that are invariant to lighting conditions. Edge maps and quotient
images [5] are examples of features from this category. While
these extracted features aim to be lighting invariant, they are
unlikely to be so when large lighting variations are present. The
third approach uses images captured under different lighting conditions to construct a generative 3-D face illumination cone [16], which is then approximated by a low-dimensional linear subspace. A limitation of this approach is its need for training images under varying lighting conditions; in practice, however, the training images are likely to be taken in a single setting and are therefore unlikely to exhibit large lighting variations. Consequently, face modeling is not very appropriate for practical scenarios. In order to
have a robust pre-processing method, we adopt illumination
normalization to reduce lighting variations across images. In
978-1-4799-7862-5/15/$31.00 © 2015 IEEE