Pattern Recognition 35 (2002) 2073–2093
www.elsevier.com/locate/patcog

Recovering facial pose with the EM algorithm

Kwang Nam Choi 1, Marco Carcassoni, Edwin R. Hancock ∗

Department of Computer Science, University of York, York, YO10 5DD, UK

Received 24 August 2000; received in revised form 28 December 2000; accepted 12 January 2001

∗ Corresponding author. Tel.: +44-1904-43-3374; fax: +44-1904-43-2767. E-mail address: erh@cs.york.ac.uk (E.R. Hancock).
1 The author is now with Information and Telecommunication Research Institute, Chung-Ang University, Seoul 156-756, South Korea.

0031-3203/02/$22.00. PII: S0031-3203(01)00173-X

Abstract

This paper describes how 3D facial pose may be estimated by fitting a template to 2D feature locations. The fitting process is realised as projecting the control points of a 3D template onto the 2D feature locations under orthographic projection. The parameters of the orthographic projection are iteratively estimated using the EM algorithm. The method is evaluated on both contrived data with known ground-truth together with some more naturalistic imagery. These experiments reveal that under favourable conditions the algorithm can estimate facial pitch to within 3°. © 2002 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.

Keywords: Facial pose estimation; Facial feature detection; EM algorithm

1. Introduction

Facial pose estimation is a key task for many practical computer vision applications. Specific examples include visual surveillance, camera assisted user interfaces [1] and user identification or verification [2]. In essence, the problem revolves around the fitting of a generic 3D template to segmented facial features located in a 2D image. The complexity of the problem depends critically on whether or not the features are labelled, i.e. whether the model-data correspondences are known a priori. In the case of labelled data with known correspondences, then the complexity of the search-space is considerably reduced. For unlabelled data with unknown feature correspondences, then great care must be taken to render the registration process computationally tractable. Once the template has been fitted to the feature data, then 3D pose parameters may be used to manipulate the face. For instance Avidan and Shashua have used such information for view synthesis [3]. Viewed in this way pose estimation may be regarded as an essential pre-requisite for detailed facial verification.

There have been many attempts at efficiently recovering the 3D facial pose. Most of these use domain specific cues to limit the search-space of the 3D model. Typically, the generic facial template must be translated, scaled and subjected to Eulerian rotation. One of the most powerful cues is to use the baseline of the eyes to estimate the gaze direction [4]. In this way the tilt-direction may be determined prior to rotation estimation. Based on the known ratio of the inter-eye separation and the distance to other axial features such as the tip of the nose or the lips, the rotation angle may also be estimated. In fact, the idea of using domain-specific cues to restrict the search-space is quite generic and has been used in a number of 3D object registration applications. One notable example is the fitting of 3D models to 2D images of vehicles [5].

1.1. Related literature

The facial pose estimation problem can be regarded as having two ingredients. The first of these is a means of locating facial features. The second is the use of these features to estimate pose angles. In this subsection we review the literature in these two areas.
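To make the geometry mentioned in the introduction concrete, the following minimal Python sketch projects a small 3D template under Euler rotation, scaling and translation using orthographic projection, and then reads the in-plane tilt off the projected eye baseline. It illustrates the projection model and the eye-baseline cue only, not the EM estimation scheme. The four-point template, the rotation order Rz·Ry·Rx and all function names are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

# Hypothetical four-point template (arbitrary units):
# left eye, right eye, nose tip, mouth centre.
TEMPLATE = np.array([
    [-1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, -1.0, 0.2],
])


def euler_rotation(pitch, yaw, roll):
    """Return R = Rz(roll) @ Ry(yaw) @ Rx(pitch); angles in radians.

    The rotation order is an assumed convention, chosen for illustration.
    """
    cx, sx = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cz, sz = np.cos(roll), np.sin(roll)
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    return rz @ ry @ rx


def project_orthographic(points3d, pitch, yaw, roll, scale=1.0, shift=(0.0, 0.0)):
    """Rotate the template, drop the depth coordinate, then scale and translate."""
    rotated = points3d @ euler_rotation(pitch, yaw, roll).T
    return scale * rotated[:, :2] + np.asarray(shift)


def tilt_from_eye_baseline(left_eye, right_eye):
    """In-plane angle (radians) of the line joining the two projected eyes."""
    dx, dy = np.asarray(right_eye) - np.asarray(left_eye)
    return np.arctan2(dy, dx)


points2d = project_orthographic(TEMPLATE, pitch=0.2, yaw=0.4, roll=0.3,
                                scale=2.0, shift=(5.0, -3.0))
tilt = tilt_from_eye_baseline(points2d[0], points2d[1])
```

Under the assumed rotation order, and for yaw magnitudes below 90°, the angle of the projected eye baseline equals the roll angle regardless of pitch, scale and translation, which is why the tilt can be fixed before the remaining rotation is estimated.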