Pattern Recognition 41 (2008) 2237–2252
www.elsevier.com/locate/pr

Human action recognition using shape and CLG-motion flow from multi-view image sequences

Mohiuddin Ahmad, Seong-Whan Lee

Department of Computer Science and Engineering, Korea University, Anam-dong, Seongbuk-ku, Seoul 136-713, Republic of Korea

Received 26 June 2007; received in revised form 6 November 2007; accepted 4 December 2007

Abstract

In this paper, we present a method for human action recognition from multi-view image sequences that uses combined motion and shape flow information and takes variability into account. A combined local–global (CLG) optic flow is used to extract the motion flow feature, and invariant moments with flow deviations are used to extract the global shape flow feature from the image sequences. In our approach, a human action is represented as a set of multidimensional CLG optic flow and shape flow feature vectors within the spatial–temporal action boundary. Actions are modeled by a set of multidimensional HMMs for multiple views using the combined features, which enforce robust view-invariant operation. We successfully recognize different daily-life human actions in indoor and outdoor environments using the maximum likelihood estimation approach. The results suggest that the proposed method is robust with respect to multi-view action recognition, scale and phase variations, and invariant analysis of silhouettes.

© 2007 Elsevier Ltd. All rights reserved.

Keywords: Action recognition; Action matrix; Combined local–global (CLG) optic flow; Invariant Zernike moments; Multi-view image sequence; Multidimensional hidden Markov model (MDHMM)

1. Introduction

Recognition of human actions from multi-view image sequences is very popular in the computer vision community because it has applications in video surveillance and monitoring, human–computer interaction, model-based compression, augmented reality, and so on.
The existing methods of human action recognition can be categorized according to the image state properties they rely on, such as motion-based, shape-based, gradient-based, etc. Several human action recognition methods have been proposed in the last few decades. Detailed surveys can be found in Refs. [1–4], where different methodologies of human action recognition, human movement, etc., are discussed. Based on these reviews, researchers use either human body shape information or motion information, with or without a body shape model, for action recognition. Our approach can be considered a combination of shape- and motion-based representation without using any prior body shape model.

A preliminary version of this paper was presented at the 7th IEEE International Conference on Automatic Face and Gesture Recognition, Southampton, UK, April 2006.
Corresponding author. Tel.: +82 2 3290 3197; fax: +82 2 926 2168.
E-mail addresses: mohi@image.korea.ac.kr (M. Ahmad), swlee@image.korea.ac.kr (S.-W. Lee).
0031-3203/$30.00. doi:10.1016/j.patcog.2007.12.008

One standard approach for human action recognition is to extract a set of features from each frame of the image sequence, and to use these features to train classifiers and perform recognition. Therefore, it is important to answer the following question: which feature is robust for action recognition under critical conditions or in a varying environment? Usually, no rigid syntax or well-defined structure is available for human action recognition. Moreover, several sources of variability [30] can affect human action recognition, such as variation in speed, viewpoint, size and shape of the performer, phase change of the action, and so on, and the motion of the human body is non-rigid in nature. These characteristics make human action recognition a more challenging and sophisticated task.
Considering the above circumstances, we address some issues that affect the development of action models and classification, which are as follows: the trajectory of an action differs between viewing directions; some of the body parts (part of hand, lower part