Robust Landmark Localization and Tracking for Improved Facial Expression Analysis

Hamdi Dibeklioğlu, Albert Ali Salah, Theo Gevers
Intelligent Systems Lab Amsterdam, University of Amsterdam
{h.dibeklioglu, a.a.salah, th.gevers}@uva.nl

Keywords: Facial landmarking, facial expression analysis, structural prior

Abstract

Automatic facial expression analysis requires localization and tracking of facial features in an accurate and robust manner. We describe a statistical method for automatic facial landmark localization, which is then used to improve a state-of-the-art deformable tracking based facial expression recognition algorithm. Our landmarking is based on Gabor wavelet features modeled with mixtures of factor analyzers. Once the landmarks are automatically located, they are used to initialize a tracker that maintains a simplified 3D face model matched to the appearance of the face. The deformations are classified with a naïve Bayes scheme into six expression categories. We test the proposed methods extensively on cross-database experiments conducted on FRGC, Cohn-Kanade, and Bosphorus face datasets. Our results show that the statistical landmarking method we propose is robust in its generalization and increases the accuracy of expression recognition significantly.

1 Introduction

Evaluation of facial expression relies on accurate face detection, face registration, localization of fiducial points in faces, and classification of shape/appearance information into expressions. When temporal information is available, tracking and temporal modeling also enter the picture.

The pipeline of a facial expression analysis method starts with face detection, and often proceeds by locating several fiducial points on detected faces, also called anchor points, or landmarks. Usually, corners of eyes and eyebrows, centers of irises, nose tip, mouth corners, and the tip of the chin are used as key landmarks.
These key landmarks are generally sufficient for face registration. However, more landmarks (20 to 60 points) are required in facial expression analysis.

Facial surface deformations caused by expressions can be described by movements of selected facial feature points. If these points are discriminative enough, and are detected accurately, deformation analysis can classify facial expressions. Nevertheless, changing pose, resolution, and illumination conditions make facial landmark localization a challenging problem. In particular, statistical models can fail if the variation in the training set is not sufficient for generalization to unseen test samples.

In this paper we describe a system for facial expression analysis which encompasses six basic emotional expressions (i.e., happiness, fear, surprise, anger, disgust, and sadness). Our emphasis, however, is on automatic landmarking, the effect of which we assess on subsequent expression categorization. We improve a recent facial landmarking algorithm by introducing a prior conditioned on face detection. We show via extensive cross-database tests that the landmarking method we use is robust. Our subsequent tests on emotion recognition establish that accurate statistical descriptions of several landmarks can be successfully used to improve expression analysis through tracking; we obtain an improvement of as much as 10 per cent in classification accuracy through the use of statistical landmarking.

This paper is structured as follows. In Section 2, an overview of the system is presented. Section 3 describes related work in landmarking, followed by Section 4, which describes our statistical landmark localization algorithm. In Section 5, we present the algorithms for model-based facial tracking and expression analysis. The experimental results are presented in Section 6, followed by our conclusions in Section 7.
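As an illustration of how movements of facial feature points can be turned into an expression label, the following is a minimal sketch of a naïve Bayes classifier over landmark displacements. It is not the implementation used in this paper: the 2D displacement features, the per-feature Gaussian likelihood, the variance floor, and all names (`displacement_features`, `GaussianNaiveBayes`, `var_floor`) are assumptions made for the example.

```python
import math
from collections import defaultdict

# The six basic expression categories named in the introduction.
EXPRESSIONS = ["happiness", "fear", "surprise", "anger", "disgust", "sadness"]


def displacement_features(neutral, expressive):
    """Flatten per-landmark (dx, dy) displacements into one feature vector.

    Both arguments are equal-length lists of (x, y) landmark coordinates.
    Using raw 2D displacements is an assumption made for this sketch.
    """
    feats = []
    for (x0, y0), (x1, y1) in zip(neutral, expressive):
        feats.extend((x1 - x0, y1 - y0))
    return feats


class GaussianNaiveBayes:
    """Naive Bayes with one independent Gaussian per feature and class."""

    def fit(self, X, y, var_floor=1e-6):
        # Group the training vectors by their expression label.
        by_class = defaultdict(list)
        for features, label in zip(X, y):
            by_class[label].append(features)
        self.params = {}
        for label, rows in by_class.items():
            n = len(rows)
            columns = list(zip(*rows))
            means = [sum(col) / n for col in columns]
            # The variance floor keeps the log-likelihood finite
            # when a feature is constant within a class.
            variances = [
                sum((v - m) ** 2 for v in col) / n + var_floor
                for col, m in zip(columns, means)
            ]
            self.params[label] = (math.log(n / len(X)), means, variances)
        return self

    def log_posterior(self, features, label):
        """Unnormalized log posterior: log prior plus summed log-likelihoods."""
        log_prior, means, variances = self.params[label]
        return log_prior + sum(
            -0.5 * (math.log(2 * math.pi * var) + (f - m) ** 2 / var)
            for f, m, var in zip(features, means, variances)
        )

    def predict(self, features):
        """Return the label with the highest posterior for one feature vector."""
        return max(self.params, key=lambda label: self.log_posterior(features, label))
```

In this sketch, a classifier would be trained with `GaussianNaiveBayes().fit(X, y)`, where each row of `X` comes from `displacement_features` and each entry of `y` is one of `EXPRESSIONS`; `predict` then labels a new displacement vector. The independence assumption keeps the model small, at the cost of ignoring correlations between neighbouring landmarks.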
2 Overview of the System

In this section, we briefly describe the overall pipeline of the system (see Figure 1). The camera