A Novel Technique for Space-Time-Interest Point Detection and Description for Dance Video Classification

Soumitra Samanta and Bhabatosh Chanda

ECSU, Indian Statistical Institute, Kolkata, India
{soumitra_r,chanda}@isical.ac.in

Abstract. This paper addresses a different type of video analysis problem: cultural activity analysis in general, and Indian Classical Dance (ICD) classification in particular. To tackle this problem, we propose a novel method for space-time interest point (STIP) detection and description using differential geometry. Each video is represented by a sparse code of the STIP descriptors in each frame, and classification is then performed by a non-linear SVM with a χ²-kernel. We have created an ICD dataset of six classes (Bharatanatyam, Kathak, Kuchipudi, Mohiniyattam, Manipuri and Odissi) from YouTube and obtained an average accuracy of 68.18%, which is better than the performance of state-of-the-art general human activity classification methods. We have also tested our algorithm on benchmark datasets such as UCF sports and KTH, where its accuracy is comparable to that of the state of the art.

1 Introduction

Over the last two decades, researchers have been drawn to general human activity analysis: single-actor activities (e.g., hand waving and running), multiple-actor activities (e.g., handshaking and punching) and human-object interaction (e.g., answering a phone, getting out of a car) [1,2]. Cultural activity analysis, such as dance classification, has received far less attention. This paper addresses a cultural activity analysis problem, more specifically Indian classical dance classification. The work is important not only for retrieval but also for the digitization of cultural heritage and for analyzing a particular dance language. From a cultural point of view, Indian classical dance, which is connected to entertainment as well as religion, has a long history.
The earliest civilizations, Mohenjo Daro and Harappa, existed in the Indus valley of the Indian subcontinent in about 6000 B.C. [3]. A beautiful little statuette of a dancing girl was found at Mohenjo Daro.

Indian classical dance involves gestures of all the body parts. Because of occlusion, variation in clothing and differing lighting conditions, it is not possible to capture all the gestures of a dance with current state-of-the-art technology. In general human activity analysis, the local spatio-temporal feature based approach is the most successful one. In this approach, space-time interest points are first detected from the video data. Each detected point is then described by local gradient and motion information. Then a vocabulary is learned by clustering the

G. Bebis et al. (Eds.): ISVC 2013, Part I, LNCS 8033, pp. 507–516, 2013. © Springer-Verlag Berlin Heidelberg 2013