Enhanced Level Building Algorithm for the Movement Epenthesis Problem in Sign Language Recognition

Ruiduo Yang 1, Sudeep Sarkar 1, and Barbara Loeding 2
1 Computer Science and Engineering, University of South Florida, Tampa, FL 33620, USA, {ryang, sarkar}@csee.usf.edu
2 Special Education, University of South Florida, Lakeland, FL 33603, bloeding@lklnd.usf.edu

Abstract

One of the hard problems in automated sign language recognition is the movement epenthesis (me) problem. Movement epenthesis is the gesture movement that bridges two consecutive signs. This effect can extend over a long duration and involve variations in hand shape, position, and movement, making it hard to model these intervening segments explicitly. This creates a problem when trying to match individual signs to full sign sentences, since for many chunks of the sentence, corresponding to these mes, we do not have models. We present an approach based on a version of a dynamic programming framework, called Level Building, to simultaneously segment and match signs to continuous sign language sentences in the presence of movement epenthesis (me). We enhance the classical Level Building framework so that it can accommodate me labels for which we do not have explicit models. This enhanced Level Building algorithm is then coupled with a trigram grammar model to optimally segment and label sign language sentences. We demonstrate the efficiency of the algorithm using a single-view video dataset of continuous sign language sentences. We obtain an 83% word-level recognition rate with the enhanced Level Building approach, as opposed to a 20% recognition rate using a classical Level Building framework on the same dataset. The proposed approach is novel since it does not need explicit models for movement epenthesis.

1. Introduction

The task of sign language recognition offers a unique opportunity for the development of motion recognition algorithms for human-computer interfaces.
In particular, it lets us easily get beyond just single gestures or signs. In practice, HCI would involve the composition of individual gestures, just as sign sentences are compositions of individual signs. When signs appear in sentence contexts, variations appear; sentences are not simply the concatenation of individual signs. In the phonological processes of sign language, a movement segment sometimes needs to be added between two consecutive signs [10]. This is called movement epenthesis (me). Fig. 1 shows an example of me frames. These frames do not correspond to any sign, can involve changes in hand shape and movement, and can span many frames, sometimes equaling actual signs in length. Given N possible signs, there would be O(N^2) possible types of movement epenthesis, which makes it computationally burdensome to model all possible mes explicitly. There are also other types of phonological processes where the appearance of a sign is affected by the preceding and succeeding signs; these processes include hold deletion, metathesis, and assimilation. These are analogous to the “coarticulation” issue in speech [4]. There is no correlate for “movement epenthesis” in speech. Movement epenthesis occurs very frequently between consecutive signs, unlike the “coarticulation”-like processes, which occur only in a small number of signs [2]. We concur with Sylvie and Ranganath [2] that movement epenthesis should be dealt with first. The new matching algorithm in this paper is a contribution in that direction.

Figure 1. The first frame is the end of the sign "GATE" and the last frame is the start of the sign "WHERE"; the transition frames in between carry no meaning and form the me segment.

As one can easily gather from an excellent review of sign language recognition [2], approaches based on Hidden Markov Models (HMMs) [8] with statistical grammar modeling or on Dynamic Time Warping (DTW) [7] are the most common.
Both originate from the speech recognition community, where it has been found that the performance of the two approaches is similar. Since movement epenthesis is not
1-4244-1180-7/07/$25.00 ©2007 IEEE