FACE ALIGNMENT BASED ON THE MULTI-SCALE LOCAL FEATURES

Cong Geng, Xudong Jiang

Nanyang Technological University
Electrical and Electronic Engineering
Nanyang Link, Singapore 639798

ABSTRACT

Many face recognition algorithms depend on careful positioning of face images into the same canonical pose. Currently, this positioning is usually done by detecting the locations of the eyes, and the face images are then transformed to the same positions according to the detected eye coordinates. In this paper, we describe a method based on multi-scale local features that achieves face alignment automatically without depending solely on the localization of the two eyes. Given an unaligned face image resulting from a face detector and a set of aligned face images in the data set, we build an automatic transformation mechanism under which the unaligned face image can be precisely aligned for the subsequent recognition process. Our alignment method improves performance on face recognition tasks over images aligned by many other algorithms.

Index Terms: face alignment, multi-scale local features, eye detection, face recognition

1. INTRODUCTION

Since Principal Component Analysis (PCA) [1] and Linear Discriminant Analysis (LDA) [2] were introduced into face recognition, various holistic approaches have been extensively studied [3]. However, the holistic approaches require a preprocessing procedure to normalize the variations of the face image in pose and scale. This is not an easy task, because it depends on the accurate detection of at least two landmarks in the face image. Some algorithms for eye localization have been proposed based on the eyeball [4, 5, 6, 7, 8]. However, in many real applications the eyeballs are indistinct or missing because of expression, occlusion, illumination or image noise. Hence, some algorithms localize multiple facial features such as the corners of the eyes, the nostrils, the tip of the nose, the corners of the mouth, etc.
Face alignment is then performed based on these semantic features [9]. The same problem encountered in the detection of eyes remains. Moreover, in the training process these semantic features are hand-annotated, which is very labor-intensive. In [10], an unsupervised approach is proposed for face alignment, which is not based on the localization of semantic facial features. As the performance of the face alignment algorithm influences the final recognition performance, many research papers on the holistic approaches report the recognition performance on pre-normalized faces. The recognition performance will deteriorate considerably if the manual process is replaced by an automatic landmark detection algorithm.

In contrast to holistic methods, some local feature based approaches for face recognition are more robust to variations in pose and scale. Furthermore, unlike in the holistic approaches, face normalization is an integrated part of the local approaches [11, 12, 13, 14]. To solve the alignment problem in holistic approaches, we propose a face alignment strategy based on multi-scale local features instead of just two specific eye points. In [15], a method for partial face alignment in near infrared (NIR) video sequences is proposed based on SIFT [11]. Unlike this approach [15], the anchor points in our template face image are detected and learned automatically. In the alignment stage, we do not use the shape constraint of [15], which is limited to aligning frontal faces with slight pose variations. Instead, we use the Hough transform to cluster keypoints with similar poses and then apply an affine transform to each cluster to remove spurious correspondences. In this way, we can align faces with large pose variations. The performance of our face alignment strategy is validated by face recognition tasks using the holistic approaches LDA [2], UFS [16] and ERE [17].
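
The correspondence-filtering step described above (Hough-transform clustering of keypoints with similar poses, followed by an affine fit per cluster) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names, the bin widths, the residual tolerance `tol`, the minimum cluster size and the `[x, y, scale, orientation]` keypoint layout are all hypothetical choices, and each match votes into a single pose bin for brevity.

```python
from collections import defaultdict

import numpy as np


def hough_pose_clusters(src, dst, bin_xy=20.0, bin_log_scale=1.0,
                        bin_rot=np.pi / 6):
    """Group keypoint matches by the coarse pose they imply.

    src, dst: (N, 4) arrays of [x, y, scale, orientation] for matched
    keypoints in the template and in the unaligned image (assumed layout).
    Returns lists of match indices, largest cluster first.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    scale = dst[:, 2] / src[:, 2]
    # Wrap the orientation difference into [-pi, pi).
    rot = (dst[:, 3] - src[:, 3] + np.pi) % (2 * np.pi) - np.pi
    c, s = scale * np.cos(rot), scale * np.sin(rot)
    # Translation implied by each match: t = dst_xy - scale * R(rot) * src_xy.
    tx = dst[:, 0] - (c * src[:, 0] - s * src[:, 1])
    ty = dst[:, 1] - (s * src[:, 0] + c * src[:, 1])
    keys = np.floor(np.column_stack([
        tx / bin_xy, ty / bin_xy,
        np.log2(scale) / bin_log_scale, rot / bin_rot,
    ])).astype(int)
    clusters = defaultdict(list)
    for i, key in enumerate(map(tuple, keys)):
        clusters[key].append(i)
    return sorted(clusters.values(), key=len, reverse=True)


def fit_affine(src_xy, dst_xy):
    """Least-squares affine map; returns M (3x2) with [x, y, 1] @ M = [x', y']."""
    A = np.column_stack([src_xy, np.ones(len(src_xy))])
    M, *_ = np.linalg.lstsq(A, dst_xy, rcond=None)
    return M


def verify_cluster(src_xy, dst_xy, idx, tol=1.0, min_matches=4):
    """Drop the worst match and refit until every residual is within tol."""
    idx = list(idx)
    while len(idx) >= min_matches:
        M = fit_affine(src_xy[idx], dst_xy[idx])
        pred = np.column_stack([src_xy[idx], np.ones(len(idx))]) @ M
        err = np.linalg.norm(pred - dst_xy[idx], axis=1)
        worst = int(np.argmax(err))
        if err[worst] <= tol:
            return M, idx
        del idx[worst]  # treat the largest residual as a spurious match
    return None, []
```

A caller would pass the largest cluster from `hough_pose_clusters` to `verify_cluster` and use the returned affine matrix to warp the unaligned face. Voting into neighbouring bins as well would reduce boundary effects; that refinement is omitted here.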
Experimental results on the Georgia Tech (GT) [18] and ORL [19] databases show that our alignment approach outperforms those based on the localization of eyes [4, 5, 6, 7, 8], the localization of facial parts [9] and the congealing approach [10].

2. FACE ALIGNMENT

The purpose of our alignment is to rectify face images into the same canonical pose for subsequent holistic recognition tasks, rather than to localize facial feature points such as the eyebrows, eyes, nose, mouth and chin contour, as many papers do. As mentioned in Section 1, face alignment algorithms based on the localization of facial parts are not reliable, because the appearance of semantic facial features varies with expression, illumination, occlusion and image noise. Hence, we propose an approach to face alignment that does not rely solely on the semantic facial parts.

978-1-4673-0046-9/12/$26.00 ©2012 IEEE, ICASSP 2012
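
For contrast, the conventional eye-based normalization that this paper argues against can be sketched in a few lines. This is a minimal illustration, not code from the paper: the function name and the canonical eye positions `canon_left` and `canon_right` are hypothetical, and the two eye centers are assumed to come from some external eye detector.

```python
import numpy as np


def similarity_from_eyes(left_eye, right_eye,
                         canon_left=(30.0, 40.0), canon_right=(70.0, 40.0)):
    """Return the 2x3 similarity transform that maps two detected eye
    centers onto assumed canonical eye positions."""
    left = np.asarray(left_eye, dtype=float)
    p = np.asarray(right_eye, dtype=float) - left
    q = np.asarray(canon_right, dtype=float) - np.asarray(canon_left, dtype=float)
    # Treat 2-D vectors as complex numbers: q / p encodes scale and rotation.
    z = complex(q[0], q[1]) / complex(p[0], p[1])
    R = np.array([[z.real, -z.imag],
                  [z.imag, z.real]])
    t = np.asarray(canon_left, dtype=float) - R @ left
    return np.hstack([R, t[:, None]])
```

Because the whole transform is determined by two points, any error in either eye location propagates directly into the rotation, scale and translation of the normalized face, which is the sensitivity discussed in Section 1.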