International Journal on Recent and Innovation Trends in Computing and Communication, ISSN: 2321-8169, Volume: 6, Issue: 3, 01 - 05
IJRITCC | March 2018, Available @ http://www.ijritcc.org

Video Classification: A Literature Survey

Pravina Baraiya¹
Department of Information Technology, Shantilal Shah Engineering College, Bhavnagar, India
palakbaraiya@gmail.com

Asst. Prof. Disha Sanghani²
Department of Information Technology, Shantilal Shah Engineering College, Bhavnagar, India
dishasanghani83@yahoo.in

Abstract-At present, a great many videos are available from many sources, but viewers want videos that match their interests. To help users find videos of interest, work has begun on video classification. This paper presents the literature on video classification. There are mainly three approaches to video classification, with features derived from three different modalities: audio, text and visual. Classification is performed using these features. Finally, the different approaches are compared. The advantages and disadvantages of each approach/method are described, together with appropriate applications.

Keywords-Video Classification; Audio-Based Approach; Text-Based Approach; Visual-Based Approach

I. INTRODUCTION
Technology is developing day by day, so people have access to a large amount of data on the Internet and television. The number of videos is also increasing day by day, so it is difficult for viewers to manually find videos of interest in these large video sources. Viewers look for videos within particular categories.
To categorize a large amount of video data, research work has begun on automatic video classification. We mainly focus on reviewing various approaches to video classification. Various features are extracted from a video in order to classify it. A video classification algorithm categorizes videos by assigning an appropriate label to each video (e.g. 'News Video', 'Cartoon Video' or 'Sports Video').

The rest of the paper is organized as follows. Section II describes an approach that uses audio features for video classification. The approach that uses text features is described in Section III. Section IV describes the approach that uses visual features. The various approaches to video classification are compared in Section V. Finally, Section VI provides conclusions.

II. GENERAL BACKGROUND
A survey of the video classification literature shows that the approaches can be divided into four groups: audio-based approaches, text-based approaches, visual-based approaches, and approaches that use various combinations of audio, text and visual features. Features for video classification are drawn from three different modalities: audio, text and visual. Generally, some authors classify the entire video, while others focus on classifying video segments, such as identifying 'Sports Video', 'News Video' or 'Cartoon Video' [1]. Other algorithms classify different kinds of sports video, such as 'golf', 'hockey' or 'football' [2].

III. AUDIO BASED APPROACH
Generally, audio clips are shorter and smaller in size, so storing audio features requires less space than storing other features. Another advantage of audio features is that they require fewer computational resources than visual features. To generate features from an audio signal, the signal is sampled at a certain rate, and the samples are then grouped together into frames.
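The sampling-and-framing step described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the procedure of any surveyed paper: the function names (frame_signal, rms, zero_crossings), the frame length, the hop length and the synthetic 440 Hz test tone are all hypothetical choices made for the example. The two per-frame measures at the end correspond to the RMS and zero crossing features discussed among the time-domain features in this section.

```python
import math

def frame_signal(samples, frame_len, hop_len):
    """Group a sampled signal into (possibly overlapping) frames.

    frame_len and hop_len are given in samples; both are illustrative
    parameters, not values taken from the surveyed papers.
    """
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame_len and hop_len must be positive")
    frames = []
    # Only full frames are kept; a trailing partial frame is dropped.
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frames.append(samples[start:start + frame_len])
    return frames

def rms(frame):
    """Root mean square of one frame: sqrt of the mean of the squares."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def zero_crossings(frame):
    """Number of sign changes between consecutive samples in one frame."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))

# Synthetic input: a 440 Hz tone sampled at 8 kHz for 0.1 s (assumed values).
sample_rate = 8000
signal = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(800)]

# 200-sample frames (25 ms) with a 100-sample hop, i.e. 50% overlap.
frames = frame_signal(signal, frame_len=200, hop_len=100)
features = [(rms(f), zero_crossings(f)) for f in frames]
```

Each entry of features holds one (RMS, zero crossing count) pair per frame. A real system would read samples from the audio track of the video instead of synthesizing a tone, and would choose the frame and hop lengths to suit the classifier.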
According to the literature, audio-based approaches are used more often than text-based approaches for video classification. Features from audio can be obtained from either the time domain or the frequency domain. Some commonly used low-level audio features are described as follows:

1. Time-Domain Features:
The root mean square (RMS) of the signal approximates loudness. It is calculated by taking a series of windowed frames of sound and computing the square root of the mean of the squares of the windowed sample values [3]. The signal may also be divided into subbands, with the energy of each subband measured separately, since different classes of sound fall into different subbands [4]. The Zero Crossing Rate (ZCR) is the total number of sign changes of the signal amplitude in the current frame. Higher frequencies have a higher zero crossing rate. For a silence frame, the loudness and the zero crossing rate are generally both below thresholds. Music normally shows less variability in the zero crossing rate.

2. Frequency-Domain Features: