Finite Element Modeling of Facial Deformation in Videos
for Computing Strain Pattern
Vasant Manohar, Matthew Shreve, Dmitry Goldgof, and Sudeep Sarkar
Computer Science & Engineering, University of South Florida
{vmanohar, mshreve, goldgof, sarkar}@cse.usf.edu
Abstract
We present a finite element modeling based approach to
compute strain patterns caused by facial deformation dur-
ing expressions in videos. A sparse motion field computed
through a robust optical flow method drives the FE model.
While the geometry of the model is generic, the material
constants associated with an individual’s facial skin are
learned at a coarse level sufficient for accurate strain map
computation. Experimental results using the computational
strategy presented in this paper emphasize the uniqueness
and stability of strain maps across adverse data conditions
(shadow lighting and face camouflage) making it a promis-
ing feature for image analysis tasks that can benefit from
such auxiliary information.
1. Introduction
Deformable modeling of facial soft tissues has found
use in application domains such as human-machine inter-
action for facial expression recognition [6]. More recently,
such modeling techniques have been used for tasks like age
estimation [9] and person identification [10, 11, 15]. Ex-
isting modeling approaches can be divided into two ma-
jor groups. Models based on solving continuum mechan-
ics problems under consideration of material properties
and other physical constraints are called physical models.
All other modeling techniques, even if they are related to
mathematical physics, are known as non-physical models.
Though physical models provide a highly accurate and robust solution strategy, such approaches face two major problems: (i) the observed physical phenomena can be very complex, and (ii) solving the underlying partial differential equations (PDEs) incurs substantial computational cost. The answers to these challenges lie in: (i) finding an adequate simplified model of the given problem that covers the essential observations and (ii) applying efficient numerical techniques for solving the PDEs.
In this work, we use the strain pattern extracted from
non-rigid facial motion as a simplified and adequate way to
characterize the underlying material properties of facial soft
tissues. The proposed method has several unique features: (i) strain is related to the biomechanical properties of facial tissues, which are unique to each individual; (ii) the strain pattern of the face is less sensitive to illumination differences (between registered and query sequences) and to face camouflage because it remains stable as long as reliable facial deformations are captured; (iii) the finite element modeling based method enforces regularization, which mitigates issues arising from automatic motion estimation, making the computational strategy accurate and robust; (iv) images or videos of facial deformations can be acquired with a regular video camera; no special imaging equipment is needed.
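To make the strain quantity concrete: under the small-deformation assumption, the strain tensor is the symmetrized gradient of the displacement field. The sketch below computes a per-pixel strain magnitude map from a dense 2-D displacement field using central finite differences. This is only an illustrative stand-in, not the paper's finite element solution; the function name and the Frobenius-norm scalarization are our own choices for the example.

```python
import numpy as np

def strain_magnitude(u, v, spacing=1.0):
    """Strain magnitude map from a dense 2-D displacement field.

    u, v: per-pixel displacement components (H x W arrays), e.g. from
    optical flow between two frames of a facial expression.
    Computes the infinitesimal strain tensor
        eps = 0.5 * (grad(d) + grad(d)^T),  d = (u, v),
    by central differences -- a plain finite-difference stand-in for
    the FE-regularized computation described in the paper.
    """
    du_dy, du_dx = np.gradient(u, spacing)
    dv_dy, dv_dx = np.gradient(v, spacing)
    exx = du_dx                    # normal strain along x
    eyy = dv_dy                    # normal strain along y
    exy = 0.5 * (du_dy + dv_dx)    # shear strain
    # Frobenius norm of the strain tensor as a scalar per-pixel magnitude
    return np.sqrt(exx**2 + eyy**2 + 2.0 * exy**2)

# Example: a pure horizontal stretch u = 0.1 * x yields exx = 0.1 everywhere,
# so the magnitude map is uniformly 0.1.
h, w = 8, 8
x = np.tile(np.arange(w, dtype=float), (h, 1))
mag = strain_magnitude(0.1 * x, np.zeros((h, w)))
```

In practice the motion field driving such a computation is noisy, which is exactly why the paper uses finite element modeling as a regularizer rather than raw per-pixel differentiation.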
Existing work on face animation and recognition using highly accurate models takes into account anatomical details of a face, such as bones, musculature, and skin tissues [12, 13, 16]. However, a major challenge of using a sophisticated anatomy-based model is the high computational complexity involved. An alternative is to extract
biomechanical information (that might be adequate for cer-
tain tasks) from images and videos without building a full-
scale model. Essa and Pentland [6] developed a finite el-
ement model to estimate visual muscle activations and to
generate motion-energy templates for expression analysis.
However, automatic identification of action units that es-
timate the muscle activations is still a topic of open re-
search. In our approach, which is also based on biome-
chanics, we go a step further by quantifying the soft tissue
properties through its elasticity and effectively representing
it by means of strain maps.
The study of facial strain requires high quality motion
data generated by robust tracking methods, an extensively
investigated subject in computer vision. The trend is to in-
tegrate various image cues and prior knowledge into a face
model [2, 5]. Such methods rely on a certain degree of user
intervention, for either model initialization or tracking guid-
ance. On the other hand, methods that avoid the use of
hand-labeled features and manual correspondence [1, 14]
required an extensive collection of training samples which
make them less scalable. Therefore, in this study, we
adopt an algorithm in its basic form – a robust optical flow
method.
Thus, the focus of this paper is on developing a robust
978-1-4244-2175-6/08/$25.00 ©2008 IEEE