Gestures for Natural Interaction with Video
Nesrine Fourati, Emmanuel Marilly
Alcatel-Lucent Bell Labs France, Centre de Villarceaux - Route de Villejust, 91620 Nozay - France
ABSTRACT
In the context of immersive communications, we propose a method that enables natural video interaction between users
and a video meeting system through hand gesture recognition. The interaction can be performed by means of either
hand posture recognition or dynamic hand gesture recognition, according to the user’s preference. The statistical
approach adopted in our work to recognize hand postures has shown accurate results in both the performance evaluation
and the user test. In addition, combining data mining and signal processing for dynamic gesture recognition allows
us to define appropriate rules and to reduce the confusion between gestures. Furthermore, the hand region extraction
is based on both skin color and background subtraction, which avoids detecting static objects whose color is similar
to skin. Finally, the collected user feedback allows us to evaluate our approach from the user’s point of view and to
identify its limitations, which are discussed in our perspectives with a view to improving the results.
Keywords: Gesture, Posture, Recognition, Video Interactions, User’s Feedback.
1. INTRODUCTION
There is a clear trend toward changing the current mode of telecommunication to Immersive Communication. Immersive
Communication means that users want to communicate, interact, share and collaborate at a distance through sensorial
and attentional immersion. As described in [1], this shift is motivated by the fact that Immersive Communication
enables natural experiences and interactions among people, objects, and environments as if they were collocated,
although they may be geographically distributed. Based on this approach and definition, we propose a method enabling
natural video interactions (i.e. hand gesture interactions) between users and a video meeting system.
1.1 Multimodal Interaction & Gesture Models
Although humans can identify and recognize gestures easily, implementing an automatic approach that performs these
tasks is challenging because of many constraints and the wide semantic gap [3] between human and machine abilities to
recognize visual content.
Figure 1 - Hand postures
In addition to common problems such as luminosity or background complexity, the hand gesture recognition process
has to differentiate unintentional hand movements from the other hand gestures. A gestural taxonomy that explains
the difference between all classes of hand gestures can be found in [4]. Moreover, because the number of possible
hand gestures is unlimited, categorizing them into non-overlapping classes is a difficult task.
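As an illustration of how background complexity can be handled, the hand region extraction summarized in the abstract (skin color combined with background subtraction) can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in this work: the function names, the RGB skin rule, and the difference threshold are all hypothetical placeholders, and the paper does not specify its skin color model or background model here.

```python
# Hypothetical sketch: keep only pixels that are BOTH skin-colored AND
# different from a static background model, so that static skin-colored
# objects are rejected. Frames are lists of rows of (R, G, B) tuples.

def simple_rgb_skin_rule(px):
    """Toy skin test (assumption): red-dominant, reasonably bright pixel."""
    r, g, b = px
    return r > 95 and g > 40 and b > 20 and r > g and r > b and (r - min(g, b)) > 15

def skin_mask(frame, is_skin):
    """Binary mask: 1 where the pixel passes the skin color test."""
    return [[1 if is_skin(px) else 0 for px in row] for row in frame]

def foreground_mask(frame, background, threshold=30):
    """Binary mask: 1 where any channel differs from the background by more than threshold."""
    return [
        [1 if max(abs(a - b) for a, b in zip(px, bg_px)) > threshold else 0
         for px, bg_px in zip(row, bg_row)]
        for row, bg_row in zip(frame, background)
    ]

def hand_candidates(frame, background, is_skin=simple_rgb_skin_rule, threshold=30):
    """Pixel-wise AND of the skin mask and the foreground mask."""
    skin = skin_mask(frame, is_skin)
    moving = foreground_mask(frame, background, threshold)
    return [[s & m for s, m in zip(s_row, m_row)] for s_row, m_row in zip(skin, moving)]

# Tiny 2x2 example: a static skin-colored object (top right) is present in
# both the background and the frame; a hand appears at the bottom left.
WALL = (120, 120, 120)
SKIN = (200, 140, 120)
background = [[WALL, SKIN], [WALL, WALL]]
frame = [[WALL, SKIN], [SKIN, WALL]]
print(hand_candidates(frame, background))  # [[0, 0], [1, 0]]
```

In this toy example the static skin-colored pixel is discarded because it matches the background, while the newly appearing skin-colored pixel is kept as a hand candidate.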
Figure 1 labels (posture: command): Pointer: Record; Palm: Pause; Clenched fingers: Stop; Thumb and forefinger: Replay.
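The posture and command labels of Figure 1 can be read as a lookup from a recognized posture to a video command. The sketch below illustrates that reading only; the pairing is inferred from the order of the figure labels, and the identifiers (`Command`, `POSTURE_TO_COMMAND`, `command_for`) are hypothetical names, not part of the described system.

```python
from enum import Enum

class Command(Enum):
    """Video commands named in Figure 1."""
    RECORD = "record"
    PAUSE = "pause"
    STOP = "stop"
    REPLAY = "replay"

# Assumed pairing, read from the label order in Figure 1.
POSTURE_TO_COMMAND = {
    "pointer": Command.RECORD,
    "palm": Command.PAUSE,
    "clenched_fingers": Command.STOP,
    "thumb_and_forefinger": Command.REPLAY,
}

def command_for(posture_label):
    """Return the command for a recognized posture, or None for anything else."""
    return POSTURE_TO_COMMAND.get(posture_label)
```

Returning `None` for an unknown label leaves room for rejecting unintentional hand movements instead of forcing every detection onto a command.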
Visual Information Processing and Communication III, edited by Amir Said, Onur G. Guleryuz, Robert L. Stevenson,
Proc. of SPIE-IS&T Electronic Imaging Vol. 8305, 83050L · © 2012 SPIE-IS&T
CCC code: 0277-786X/12/$18 · doi: 10.1117/12.906681