HMM Based Classification of Sports Videos Using Color Feature

Josh Hanna, Fatma Patlar, Akhan Akbulut, Engin Mendi and Coskun Bayrak ∗†

Computer Science Department, University of Arkansas at Little Rock, AR, USA
{jjhanna | esmendi | cxbayrak}@ualr.edu
Computer Engineering Department, Istanbul Kultur University, Istanbul, Turkey
{f.patlar | a.akbulut | c.bayrak}@iku.edu.tr

Abstract—Video content classification is an important element for efficient access and retrieval of video in any media content management system. Categorizing video segments makes it easier to reach the relevant video content without sequential scanning. In this paper, we present a Hidden Markov Model (HMM) based classification technique for sports videos. The speed of color change is computed for each video frame and used as the observation sequence for the HMM. Experiments on more than 1 hour of sports video, comprising 18 training and 18 testing videos from 3 predefined genres (golf, hockey and football), give very satisfactory classification accuracy.

I. INTRODUCTION

Multimedia content classification refers to the computerized understanding of the semantic meaning of a multimedia file or document. With the increase in digital video content, efficient techniques for classifying videos according to their content have become more important. Applications such as digital libraries, e-Learning, video-on-demand, digital video broadcast and interactive TV generate and use large collections of video data. For these video data to be used effectively, all digital content must be classified by category.

There has been a growing demand for content based automatic video classification for managing multimedia on the web. For this reason, considerable research is being done on such systems. Several content based classification systems for organizing and managing video databases have been recently proposed.
Classification of videos into predefined genres is the most preferred approach. The basic working principle of this type of application is the classical pattern classification algorithm [1]. First, features such as color, sound or video text are extracted from the videos; they are then passed through a reduction process to prepare them for classification.

In [2], nearest neighbor clustering is used for video classification. A more complex, fully automatic and computationally efficient framework for the analysis and summarization of soccer videos uses cinematic and object-based features for semantic analysis of sports videos [3].

Extracted features are commonly classified with HMM for segmenting video contents. Boreczky and Lynn [4] used three types of features for video segmentation: the standard histogram difference, an audio distance measure and an estimate of object motion between two adjacent frames. Other implementations used object color and texture features to generate highlights for soccer videos [5].

Zhu [6] classified news stories using features obtained from closed captions. This work is an example of video classification using only text features.

Liu [7] [8] [9] used audio features such as non-silence ratio, volume standard deviation, volume dynamic range, pitch standard deviation, voice/music ratio, noise/unvoice ratio, frequency centroid and frequency bandwidth. These features are extracted from segments of the sampled audio signals and used in a one-class-one-network structure for classification.

In this paper, we present an HMM based approach to video content classification using a color feature. Our aim is to assign the input video to one of the predefined groups: golf, hockey and football.

The rest of this paper is organized as follows. Section 2 presents the concept of applying HMM for video classification and our feature extraction details.
Experimental results and conclusion are given in Section 3 and Section 4, respectively.

II. HMM FOR VIDEO CLASSIFICATION

HMM is a popular technique widely used in signal processing. HMMs are a formal foundation for making probabilistic models of linear sequence “labeling” problems [10], and they are especially known for their applications in temporal pattern recognition such as speech, handwriting, gesture recognition, part-of-speech tagging, musical score following, partial discharges and bioinformatics. They are mostly used for classifying sequential data, where they capture the temporal relationships of the extracted features. In our research, we extend them to video analysis and classification.

A. Definition of HMM

In an HMM, there is a finite number of states, each of which is associated with transition probabilities to the other states. At any given time, the HMM is in exactly one state. The state at time t is directly influenced by the state at time t - 1. After each transition from one state to another, an output observation is generated based on an observation probability distribution associated with the current state [10]. Formally, an HMM is defined as HMM = {N, B, Π}, where N is the set of states, B is the number of observation symbols and Π is the set of state transition probabilities.
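
As a rough illustration of how such a model can score an observation sequence, the sketch below implements the standard scaled forward algorithm for a discrete HMM and picks the model with the highest likelihood. It is a minimal sketch under stated assumptions, not the implementation used in this paper: the initial state distribution `start`, the emission matrix `emit`, the use of one HMM per class, the two toy models and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def log_likelihood(obs, start, trans, emit):
    """Log P(obs | model) for a discrete HMM, via the scaled forward algorithm.

    obs   : sequence of observation symbol indices
    start : (n_states,) initial state probabilities (assumed, not in the text)
    trans : (n_states, n_states) state transition probabilities
    emit  : (n_states, n_symbols) observation probabilities per state
    """
    alpha = start * emit[:, obs[0]]      # joint prob. of first symbol and state
    scale = alpha.sum()
    log_p = np.log(scale)
    alpha = alpha / scale                # rescale to avoid numerical underflow
    for symbol in obs[1:]:
        # move one step through the transition matrix, then emit the symbol
        alpha = (alpha @ trans) * emit[:, symbol]
        scale = alpha.sum()
        log_p += np.log(scale)
        alpha = alpha / scale
    return log_p

def classify(obs, models):
    """Return the label whose HMM gives obs the highest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(obs, *models[label]))

# Hypothetical toy models: 2 states, 2 symbols
# (symbol 0 = slow color change, symbol 1 = fast color change).
start = np.array([0.5, 0.5])
trans = np.array([[0.9, 0.1],
                  [0.1, 0.9]])
models = {
    "slow_changing": (start, trans, np.array([[0.9, 0.1], [0.8, 0.2]])),
    "fast_changing": (start, trans, np.array([[0.2, 0.8], [0.1, 0.9]])),
}

print(classify([0, 0, 1, 0, 0], models))
```

In this toy setting each class has its own model, and a sequence is assigned to whichever model explains it best; the rescaling at every step keeps the computation stable for long frame sequences.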