A nature based fusion scheme for multimodal biometric person identity verification at a distance

Ho Chiung Ching, C. Eswaran
Faculty of Information Technology, Multimedia University
Cyberjaya, Selangor, Malaysia
{ccho,eswaran}@mmu.edu.my

Abstract: This paper presents a multimodal biometric verification scheme for face, gait and speech data, inspired by how verification is done at a distance in the natural world.

Keywords: nature based, multimodal biometric verification, fusion

I. INTRODUCTION

Person identity verification is important for securing access to scarce or restricted resources. As security threats increase, the need to ascertain the claim of identity that a person makes becomes increasingly important. Person identity verification is the process of proving an identity claim (“I am person X”) and can be done through various approaches. Traditional approaches call for the person either to have knowledge that proves his identity, e.g. a password, or to possess an identity token that proves who he is, for example a pre-verified building access card. The problem with both approaches is that passwords can be forgotten, while an identity token can be misplaced or stolen. A combination of both approaches, for example a building access card which requires a passkey, can be more secure but is nonetheless still vulnerable to theft or a lapse in memory. As such, biometrics is a better solution.

Biometrics [1] is the usage of unique physical or behavioral attributes of a person as proof of their identity. Physical biometric modalities include facial features, speech, thumbprint, palm print and even the shape of a person’s ears and lips. Behavioral biometric modalities can be found in a person’s gait and cadence, as well as the person’s signature. All of these biometric features are unique to a person, and are mostly time invariant.
Recent development in biometric based person identity verification has progressed from a single modality to a mesh of multiple biometric modalities. A multimodal biometric authentication scheme is more robust than a single modal authentication scheme because, with more than one biometric modality considered, there is greater tolerance for signal loss or degradation [1]. In this respect, loss of integrity in one modality through signal loss or degradation is overcome by considering the remaining modalities. A composite result is achieved by fusing the biometric modalities at either feature level or decision level.

Person identity verification happens in the natural world as well as electronically: the age-old challenge for a password from a distant stranger before permission is granted to proceed is conceptually identical to many computer based identification systems, ranging from Identification Friend or Foe (IFF) systems on military radar systems to video based automatic person verification systems. In nature, verification often takes place based on audio and visual features. For example, chimpanzees can tell mother and son pairings based on visual cues, as shown in Parr and de Waal’s work [2]. A person can also classify another person as familiar (known) or unfamiliar based on cues from the way the other person walks (gait) and the way the person talks (speech), and finally a decision is made based on how the person looks (face). The accuracy of the classification increases as the physical distance between the two persons decreases; this happens because the facial features become more prominent and, at the same time, the speech features become more distinct.

II. PAST LITERATURE

A. Human Visual System inspired video-based verification

Widespread usage of computer vision for security related applications has encouraged deep interest in modeling the human visual system (HVS).
One particular area of interest is how aspects of the HVS can be employed to improve video surveillance, especially how the HVS is able to track shape, movement and color in spite of a wide range of illumination. Peerasathein [3] modeled the ventral stream, which contains the primary visual cortex, using a neural network classifier to enable video based object classification. Kim [4] represented the HVS as a parameterized Monte Carlo Markov model in his work to classify 3D objects. The physical and behavioral aspects of the HVS were modeled by Carnec and Barba in the context of facial recognition, which yielded good results. Current HVS based approaches for authentication approximate either the physical structures within the HVS or the behavioral aspects of the HVS.

Person verification is a task that is markedly different from person identification. Identification calls for a probe to be examined against a gallery sample in a database, whereas

2009 International Conference on Signal Acquisition and Processing 978-0-7695-3594-4/09 $25.00 © 2009 IEEE DOI 10.1109/ICSAP.2009.28 94