FACIAL ACTION RECOGNITION IN FACE PROFILE IMAGE SEQUENCES

Maja Pantic, Delft University of Technology, ITS / Mediamatics Dept., Delft, the Netherlands, M.Pantic@cs.tudelft.nl
Ioannis Patras, University of Amsterdam, Computer Science Dept., Amsterdam, the Netherlands, yiannis@science.uva.nl
Leon Rothkrantz, Delft University of Technology, ITS / Mediamatics Dept., Delft, the Netherlands, L.J.M.Rothkrantz@cs.tudelft.nl

ABSTRACT

A robust way to discern facial gestures in face images, one that is insensitive to scale, pose, and occlusion, remains the key research challenge in the domain of automatic facial-expression analysis. The practical method widely recognized as the most promising for addressing this problem is facial-gesture analysis of multiple views of the face. Yet current systems for automatic facial-gesture analysis process mainly portraits or nearly frontal views of faces. To advance the technological framework upon which research on automatic facial-gesture analysis from multiple facial views can build, we developed an automatic system that analyzes subtle changes in facial expression based on profile-contour fiducial points in a profile-view video. For tracking the profile face, we propose a probabilistic classification method based on statistical modeling of the color and motion properties of the profile in the scene. From the segmented profile face we extract the profile contour, and from it 10 profile-contour fiducial points. Based on these, a rule-based method recognizes 20 individual facial muscle actions occurring alone or in combination. A recognition rate of 85% is achieved.

1. INTRODUCTION

The research presented here pertains to the problem of automatic facial expression analysis. Our major impulse to investigate this problem comes from the significance of the information that the face provides about human behavior.
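The processing chain outlined in the abstract, probabilistic segmentation of the profile face, contour extraction, and fiducial-point localization, can be sketched in a few lines. The sketch below is purely illustrative: the Gaussian skin-color model, the 3-sigma threshold, the rightmost-pixel contour heuristic, and the uniform point sampling are all assumptions for the example, not the paper's actual models.

```python
import numpy as np

def segment_profile(frame, skin_mean, skin_cov_inv):
    """Stage 1 (sketch): probabilistic skin-color segmentation.
    Assumes a single Gaussian skin-color model; returns a boolean face mask."""
    diff = frame.reshape(-1, 3).astype(float) - skin_mean
    # Squared Mahalanobis distance of each pixel to the skin-color cluster
    d2 = np.einsum('ij,jk,ik->i', diff, skin_cov_inv, diff)
    return (d2 < 9.0).reshape(frame.shape[:2])   # ~3-sigma acceptance region

def extract_contour(mask):
    """Stage 2 (sketch): the rightmost foreground pixel in each row is taken
    as the profile contour; rows with no foreground are marked -1."""
    width = mask.shape[1]
    rightmost = width - 1 - np.argmax(mask[:, ::-1], axis=1)
    return np.where(mask.any(axis=1), rightmost, -1)

def fiducial_points(contour, n=10):
    """Stage 3 (sketch): sample n points along the contour as a stand-in for
    the paper's 10 anatomically defined profile-contour fiducial points."""
    rows = np.linspace(0, len(contour) - 1, n).astype(int)
    return list(zip(rows, contour[rows]))
```

In the real system each stage is considerably richer (the tracker also models motion, and the fiducial points are anatomically meaningful, e.g. nose tip and chin), but the division of labor between the stages is the same.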
Facial gestures (the facial muscle activity underlying a facial expression) regulate our social interactions [1]: they clarify whether our current focus of attention (a person, an object, or what has been said) is important, funny, or unpleasant to us. They are our direct, naturally preeminent means of communicating emotions [1, 2]. Automatic analyzers of subtle facial changes therefore have a natural place in various vision systems, including automated tools for psychological research, lip reading, bimodal speech analysis, affective computing, videoconferencing, face and visual speech synthesis, and human-behavior-aware next-generation interfaces. Within our research, we first investigated whether, and to what extent, human facial gestures can be recognized automatically. This paper presents preliminary results of our research on automatic recognition of facial gestures from face-profile images.

Most approaches to automatic facial expression analysis attempt to recognize a small set of prototypic emotional facial expressions: fear, sadness, disgust, anger, surprise, and happiness [3]. This practice may follow from the work of Darwin and, more recently, Ekman [2], who suggested that basic emotions have corresponding prototypic expressions. In everyday life, however, such prototypic expressions occur relatively infrequently; emotions are displayed more often by subtle changes in one or a few discrete facial features, such as raising the eyebrows in surprise [1]. To detect such subtlety of human emotion, automatic recognition of facial gestures (i.e., fine-grained changes in facial expression) is needed. Facial gestures are anatomically related to contractions of facial muscles [4], which produce changes both in the direction and magnitude of motion on the skin surface and in the shape and location of the permanent facial features (eyes, mouth, etc.).
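Changes in the shape and location of facial features are exactly what a rule-based action-unit coder can exploit: displacements of tracked points between a neutral and a current frame trigger per-AU rules. The toy coder below is a hypothetical illustration; the point names, the threshold, and the three rules are invented for the example and are far simpler than the profile-contour rules the system itself uses.

```python
# Hypothetical rule-based AU coding from fiducial-point displacements.
# Points are (row, col) image coordinates: row grows downward, col grows
# toward the front of the profile. All names and thresholds are invented.

def code_aus(neutral, current, tol=2.0):
    """Return the list of FACS action units whose (toy) rules fire."""
    def dy(name):  # vertical displacement (positive = downward)
        return current[name][0] - neutral[name][0]
    def dx(name):  # horizontal displacement (positive = forward)
        return current[name][1] - neutral[name][1]

    aus = []
    if dx('chin') > tol and dx('lower_lip') > tol:
        aus.append('AU29')     # jaw thrust: chin and lower lip move forward
    if dy('brow') < -tol:
        aus.append('AU1/AU2')  # brow raise: brow point moves upward
    if dx('tongue_tip') > tol:
        aus.append('AU19')     # tongue show: tongue point ahead of the lips
    return aus
```

Because each rule inspects only a few point displacements, rules for AUs occurring in combination can fire independently in the same frame, which is how the system reports co-occurring muscle actions.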
To reason about shown facial gestures, the face, its features, and their current appearance must first be detected. A problematic issue here is that of scale, pose, and occlusion: rigid head and body movements of the observed person usually change the viewing angle and the visibility of the tracked face and its features. As noted in [6], perhaps the most promising method for addressing this problem is the use of multiple cameras yielding multiple views of the face and its features. To date, nonetheless, work on automatic facial-gesture analysis has avoided dealing with facial views other than the frontal one: portraits (e.g., [5, 7]) or nearly frontal views of faces (e.g., [8, 9]) constitute the input data processed by existing systems. For an exhaustive review of past attempts to address the problems of automatic facial gesture recognition in frontal and nearly frontal views of faces, readers are referred to [3].

Of the several methods for recognizing facial gestures based on visually observable facial muscular activity, the FACS system [4] is the most commonly used in psychological research. Following this trend, all existing methods for automatic facial-gesture analysis, including the method proposed here, interpret the facial display information in terms of the facial action units (AUs) of the FACS system [3, 5]. Yet no automatic system is capable of encoding the full range of facial mimics, i.e., none is capable of recognizing all 44 AUs that account for the changes in facial display. Among previous works, the automatic facial-mimics analyzers presented in [9] and [7] perform best in this respect: they code 16 and 27 AUs, respectively, occurring alone or in combination in frontal-view face images.

The research reported here addresses the problem of automatic AU coding from face-profile image sequences. It was undertaken with two motivations:

1.
In a frontal view of the face, facial gestures such as showing the tongue (AU19) or pushing the jaw forward (AU29) represent out-of-plane non-rigid facial movements, which are difficult to detect [7, 8, 9]. Such facial gestures are clearly observable in a profile view of the face.

2. A basic understanding of how to achieve automatic facial gesture analysis from human face profiles is necessary for the

0-7803-7304-9/02/$17.00 ©2002 IEEE