Human Activity Recognition Using a Dynamic Texture Based Method

Vili Kellokumpu, Guoying Zhao and Matti Pietikäinen
Machine Vision Group, University of Oulu, P.O. Box 4500, Finland
{kello,gyzhao,mkp}@ee.oulu.fi

Abstract

We present a novel approach to human activity recognition. The method uses dynamic texture descriptors to describe human movements in a spatiotemporal way. The same features are also used for human detection, which keeps the whole approach computationally simple. Following recent trends in computer vision research, our method works on image data rather than silhouettes. We test our method on a publicly available dataset and compare our results to state-of-the-art methods.

1 Introduction

Human activity recognition has become an important research topic in computer vision in recent years. It has gained much attention because of its important application domains, such as video indexing, surveillance, human-computer interaction, sports video analysis and intelligent environments. Each of these application domains has its own demands, but in general, algorithms must be able to detect and recognize various activities in real time. Moreover, as people look and move differently, the designed algorithms must be able to handle variation in how activities are performed and cope with various kinds of environments.

Many approaches to human activity recognition have been proposed in the literature [4, 12]. Recently, much attention has turned to analysing human motion in spatiotemporal space instead of analysing each frame of the data separately. Blank et al. [1] used silhouettes to construct a space-time volume and used properties of the solution to the Poisson equation for activity recognition. Ke et al. [7] built a cascade of filters based on volumetric features to detect and recognize human actions.
Shechtman and Irani [19] used a correlation-based method in 3D, whereas Kobayashi and Otsu [10] used Cubic Higher-order Local Auto-correlation to describe human movements. Interest-point-based methods, which have been quite popular in object recognition, have also found their way into activity recognition. Laptev et al. [11] extended the Harris detector to space-time interest points and detected local structures that have significant local variation in both space and time. The representation was later applied to human action recognition using an SVM [17]. Dollár et al. [3] described interest points with cuboids, whereas Niebles and Fei-Fei [13] used a collection of spatial and spatio-temporal features extracted at static and dynamic interest points.
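To make the dynamic-texture idea mentioned in the abstract concrete, the sketch below shows one common way such descriptors are built: local binary pattern (LBP) histograms computed not only on a spatial (XY) image plane but also on the temporal XT and YT planes of an image volume, then concatenated into a single feature vector. This is a minimal illustration only; the function names, the choice of middle-slice planes and the plain 8-neighbour LBP are assumptions for the sketch, not the exact descriptor used in this paper.

```python
import numpy as np

def lbp_codes(plane):
    """Basic 8-neighbour LBP codes for the interior pixels of a 2-D array."""
    c = plane[1:-1, 1:-1]
    # Offsets of the 8 neighbours, ordered clockwise from the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(c, dtype=np.int32)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = plane[1 + dy: plane.shape[0] - 1 + dy,
                          1 + dx: plane.shape[1] - 1 + dx]
        # Set bit if the neighbour is >= the centre pixel.
        codes |= (neighbour >= c).astype(np.int32) << bit
    return codes

def dynamic_texture_descriptor(volume):
    """Concatenated LBP histograms from the XY, XT and YT planes of a
    (T, H, W) image volume -- a rough dynamic-texture sketch."""
    t, h, w = volume.shape
    planes = [volume[t // 2],        # XY plane: middle frame (appearance)
              volume[:, h // 2, :],  # XT plane: middle row over time (motion)
              volume[:, :, w // 2]]  # YT plane: middle column over time (motion)
    hists = [np.bincount(lbp_codes(p).ravel(), minlength=256) for p in planes]
    hists = [hh / hh.sum() for hh in hists]  # normalise each histogram
    return np.concatenate(hists)             # 3 x 256 feature vector
```

Because the temporal planes encode how intensity patterns change over time, the same concatenated histogram captures both appearance and motion, which is what lets one feature type serve both detection and recognition.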