A nature based fusion scheme for multimodal biometric person identity verification
at a distance
Ho Chiung Ching, C. Eswaran
Faculty of Information Technology, Multimedia University
Cyberjaya, Selangor, Malaysia
{ccho,eswaran}@mmu.edu.my
Abstract— This paper presents a multimodal biometric
verification scheme for face, gait and speech data, inspired
by how verification is done at a distance in the natural world.
Keywords-nature-based, multimodal biometric verification,
fusion
I. INTRODUCTION
Person identity verification is important for securing
access to scarce or restricted resources. As security threats
increase, the need to ascertain the claim of identity that a
person makes becomes increasingly important. Person
identity verification is the process of proving an identity
claim (“I am person X”) and can be done through various
approaches. Traditional approaches call for the person either
to have knowledge that proves his identity, e.g. a password,
or to possess an identity token that proves who he is, for
example a pre-verified building access card. The
problem with both approaches is that passwords can be
forgotten, while an identity token can be misplaced or stolen.
A combination of both approaches, for example a building
access card which requires a passkey, can be more secure
but is nonetheless still vulnerable to theft or a lapse in
memory. As such, biometrics offers a more robust alternative.
Biometrics [1] is the use of unique physical or behavioral
attributes of a person as proof of their identity. Physical biometric
modalities include facial features, speech, thumbprint, palm
print and even the shape of a person’s ears and lips.
Behavioral biometric modalities can be found in a person’s
gait and cadence, as well as the person’s signature. All of
these biometric features are unique to a person, and are
mostly time invariant. Recent work on biometric-based
person identity verification has progressed from a
single modality to a combination of multiple biometric modalities.
A multimodal biometric authentication scheme is more
robust than a single-modal authentication scheme because,
with more than one biometric modality considered, there is
greater tolerance for signal loss or degradation [1]. In this
respect, a loss of integrity in one modality through
signal loss or degradation is overcome by considering the
remaining modalities. A composite result is achieved
by fusing the modalities at either the feature level or the
decision level.
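
The two fusion levels mentioned above can be illustrated with a minimal Python sketch. It is not the scheme proposed in this paper: the function names, the majority-vote rule, and the use of None to mark a modality lost to signal degradation are all assumptions made only for illustration.

```python
from typing import Dict, List, Optional


def feature_level_fusion(features: Dict[str, List[float]]) -> List[float]:
    """Feature-level fusion (illustrative): concatenate the per-modality
    feature vectors into one joint vector, in a fixed (sorted) modality
    order, so that a single classifier can be applied to the result."""
    fused: List[float] = []
    for modality in sorted(features):
        fused.extend(features[modality])
    return fused


def decision_level_fusion(decisions: Dict[str, Optional[bool]]) -> bool:
    """Decision-level fusion (illustrative): strict majority vote over the
    accept/reject decisions made independently for each modality.

    A decision of None marks a modality that could not be evaluated
    (assumed here to model signal loss); it is left out of the vote,
    so the remaining modalities still produce a result. Ties reject.
    """
    votes = [d for d in decisions.values() if d is not None]
    if not votes:
        raise ValueError("no modality produced a decision")
    return 2 * sum(votes) > len(votes)


# Hypothetical example: the face signal is lost, gait and speech accept.
accepted = decision_level_fusion({"face": None, "gait": True, "speech": True})
```

In this sketch the claim is still accepted when one modality is unavailable, which is the robustness argument made in the text. A real system would normally weight the modalities rather than count them equally.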
Person identity verification happens in the natural world
as well as electronically. The age-old practice of challenging
a distant stranger for a password before granting permission
to proceed is conceptually identical to many
computer-based identification systems, ranging from
Identification Friend or Foe (IFF) systems on military radar
to video-based automatic person verification systems.
In nature, verification often takes place based on audio and
visual features. For example, chimpanzees can recognize mother
and son pairings from visual cues, as shown in Parr and
de Waal’s work [2]. A person can also classify another
person as familiar (known) or unfamiliar based on
cues from the way the other person walks (gait) and
talks (speech), with a final decision made based on
how the person looks (face). The accuracy of the
classification increases as the physical distance between the two
persons decreases, because the facial features become
more prominent and, at the same time, the speech features
become more distinct.
II. PAST LITERATURE
A. Human Visual System inspired video-based
verification
Widespread usage of computer vision for security-related
applications has encouraged deep interest in modeling the
human visual system (HVS). One particular area of interest
is how aspects of the HVS can be employed to improve
video surveillance, especially how the HVS is able to
track shape, movement and color despite wide variations in
illumination. Peerasathein [3] modeled the ventral stream,
which contains the primary visual cortex, using a neural
network classifier to enable video-based object classification.
Kim [4] represented the HVS as a parameterized Monte Carlo
Markov model in his work on classifying 3D objects. The
physical and behavioral aspects of the HVS were modeled by
Carnec and Barba in the context of facial recognition, which
yielded good results. Current HVS-based approaches for
authentication approximate either the physical structures
within the HVS or the behavioral aspects of the HVS.
Person verification is a task that is markedly different
from person identification. Identification calls for a probe to
be examined against a gallery sample in a database, whereas
2009 International Conference on Signal Acquisition and Processing
978-0-7695-3594-4/09 $25.00 © 2009 IEEE
DOI 10.1109/ICSAP.2009.28