Motion-based Recognition of People in EigenGait Space

Chiraz BenAbdelkader, Ross Cutler, and Larry Davis

University of Maryland, College Park
{chiraz,lsd}@umiacs.umd.edu

Microsoft Research
rcutler@microsoft.com

Abstract

A motion-based, correspondence-free technique for human gait recognition in monocular video is presented. We contend that the planar dynamics of a walking person are encoded in a 2D plot consisting of the pairwise image similarities of the sequence of images of the person, and that gait recognition can be achieved via standard pattern classification of these plots. We use background modelling to track the person for a number of frames and extract a sequence of segmented images of the person. The self-similarity plot is computed via correlation of each pair of images in this sequence. For recognition, the method applies Principal Component Analysis to reduce the dimensionality of the plots, then uses the k-nearest neighbor rule in this reduced space to classify an unknown person. This method is robust to tracking and segmentation errors, and to variation in clothing and background. It is also invariant to small changes in camera viewpoint and walking speed. The method is tested on outdoor sequences of 44 people, with 4 sequences of each taken on two different days, and achieves a classification rate of 77%. It is also tested on indoor sequences of 7 people walking on a treadmill, taken from 8 different viewpoints and on 7 different days. A classification rate of 78% is obtained for near-fronto-parallel views, and 65% on average over all views.

1 Introduction

Recently, gait recognition has received growing interest within the computer vision community, due to its emergent importance as a biometric. The term gait recognition is typically used to signify the identification of individuals in image sequences ‘by the way they walk’.
Gait classification is the recognition of different types of human locomotion, such as running, limping, hopping, etc. Because human ambulation is one form of human movement, gait recognition is closely related to vision methods that detect, track and analyze human movement in general.

Gait recognition research has largely been motivated by Johansson’s experiments [19] and the ability of humans to perceive motion from Moving Light Displays (MLDs). In these experiments, human subjects were able to recognize the type of movement of a person solely from observing the 2D motion pattern generated by light bulbs attached to the person. Similar experiments later showed some evidence that the identity of a familiar person (‘a friend’) [1], as well as the gender of the person [9], might be recognizable from MLDs, though in the latter case a recognition rate of 60% is hardly significantly better than chance (50%).

Despite the agreement that humans can perceive motion from MLDs, there is still no consensus on how humans interpret these MLD-type stimuli (i.e., how they use them to achieve motion recognition). Two main theories exist: the first maintains that people use motion information in the MLDs to recover the 3D structure of the moving object (person), and subsequently use the structure for recognition; the second states that motion information is directly used to recognize a motion [7].

The dynamics of gait can be fully characterized via the kinematics of a handful of body landmarks such as limbs and joints [18]. Indeed, one method of motion-based recognition is to first explicitly extract the dynamics of points on a moving object (person). Consider a point $P(t) = (x(t), y(t), z(t))$ on a moving object as a function of time $t$. The dynamics of the point can be represented by the phase plot $(P(t), \frac{dP}{dt}(t))$. Since we wish to recognize different types of motions (viz. gaits), it is important to know what can be determined from the projection of $P(t)$ onto an image plane, $(u, v) = T(P)$.
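The phase plot of a point and of its image-plane projection can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the synthetic trajectory `trajectory`, the notation $P(t) = (x(t), y(t), z(t))$, the scaled orthographic camera `project_orthographic` with its arbitrary scale of 0.1, and the central-difference velocity estimate in `phase_plot`. It is not the tracking or recognition method of this paper.

```python
import math


def trajectory(t):
    """Hypothetical planar trajectory P(t) = (x, y, z); z is held constant."""
    x = 0.5 * math.sin(2.0 * math.pi * t)  # horizontal swing
    y = 0.2 * math.cos(4.0 * math.pi * t)  # vertical bounce
    z = 10.0                               # constant depth, so the motion is planar
    return (x, y, z)


def project_orthographic(p, scale=0.1):
    """Assumed scaled orthographic projection T: drop z, scale (x, y)."""
    x, y, _ = p
    return (scale * x, scale * y)


def phase_plot(points, dt):
    """Pair each interior sample with its central-difference velocity."""
    plot = []
    for i in range(1, len(points) - 1):
        velocity = tuple(
            (after - before) / (2.0 * dt)
            for before, after in zip(points[i - 1], points[i + 1])
        )
        plot.append((points[i], velocity))
    return plot


dt = 0.01
times = [i * dt for i in range(201)]  # two periods of the horizontal swing

world_points = [trajectory(t) for t in times]
image_points = [project_orthographic(p) for p in world_points]

world_phase = phase_plot(world_points, dt)  # samples of (P, dP/dt)
image_phase = phase_plot(image_points, dt)  # samples of ((u, v), d(u, v)/dt)
```

Each entry of `world_phase` is a (position, velocity) pair in 3D, and each entry of `image_phase` is the corresponding pair in image coordinates, so the two plots can be compared sample by sample.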
Under orthographic projection, and if the point's 3D trajectory $P(t)$ is constrained to planar motion, the object dynamics are completely preserved up to a scalar factor. That is, the phase space for the point constructed from its image coordinates $(u, v)$ is identical (up to a scalar factor) to the phase space constructed from $P(t)$. However, if the motion is not constrained to a plane, then the dynamics are not preserved. Under perspective projection, the dynamics of planar and arbitrary motion are in general not preserved.

Fortunately, planar motion is an important class of motion, and includes “biological motion” [16]. In addition, if the person is sufficiently far from the camera, the camera projection becomes approximately orthographic (with scaling). In this case, and assuming we can accurately track a point in the image plane, we can completely reconstruct the phase space of the dynamic system (up to a scalar factor). The phase space can then be used directly to classify the object motion (e.g., [6]).

In general, point correspondence is not always possible in realistic image sequences (without the use of special markers), due to