This work is supported by National Natural Science Foundation projects (60672100, 60572068) and an international cooperation project (2005DFA10300).

A Shot Clustering Based Algorithm for Scene Segmentation

Xuejun Wang, Jilin University, Changchun, China, xjwang@jlu.edu.cn
Shigang Wang, Jilin University, Changchun, China, Wangshigang@vip.sina.com.cn
Hexin Chen, Jilin University, Changchun, China, chx@jlu.edu.cn
Moncef Gabbouj, Tampere University of Technology, FIN-33101 Tampere, Moncef.Gabbouj@tut.fi

Abstract

This paper presents a scene segmentation method that uses both the visual features and the motion features of video. Shots are clustered into scenes by considering not only the visual similarity but also the motion consistency of shots within a scene. In addition, a method for merging over-segmented scenes is presented. Experimental results show the effectiveness of the proposed algorithms.

1. Introduction

With the rapid development of information technology, video data has become an increasingly important part of everyday life. Content-based video analysis and retrieval has therefore been developed to help people deal with the huge amount of video data, and scene segmentation is a key problem in semantic video analysis. A scene consists of several shots that are semantically related and temporally close. As a high-level unit, a scene has two characteristics: visual similarity and time locality. That is, shots within the same scene are likely to be visually similar and close to each other in time. Note that visually dissimilar shots may still belong to the same scene as long as they are not far from each other; conversely, two visually similar shots will not be grouped into the same scene if the temporal distance between them is greater than a threshold. As we will see, these two attributes are important elements of most scene segmentation algorithms.
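The interplay of visual similarity and time locality described above can be illustrated with a minimal sketch. It scores a pair of shots by the histogram intersection of their key-frame color histograms and forces the score to zero when the shots are farther apart in time than a threshold. The `Shot` record, the choice of histogram intersection, and the 60-second default for `max_gap` are illustrative assumptions, not the algorithm proposed in this paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Shot:
    """Hypothetical shot record: start time (seconds) and a normalized color histogram."""
    start_time: float
    histogram: Sequence[float]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Return a similarity in [0, 1] for two histograms that each sum to 1."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


def time_local_similarity(a: Shot, b: Shot, max_gap: float = 60.0) -> float:
    """Visual similarity gated by time locality.

    max_gap is an assumed threshold: shots farther apart than this are
    treated as unrelated, however similar they look.
    """
    if abs(a.start_time - b.start_time) > max_gap:
        return 0.0
    return histogram_intersection(a.histogram, b.histogram)
```

With this gating, two shots with identical histograms that lie 500 seconds apart receive a similarity of zero, while nearby shots are compared purely on their color content.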
The first step of scene segmentation is shot detection. After the key frames are extracted and compared, similar shots are merged into a scene [1]. The content of a scene can then be represented by several key frames, which are simpler to process and require less data. Most current scene segmentation algorithms extract scenes by comparing shot similarity [2], and color histograms of key frames are most frequently used to compute that similarity. In addition to color, motion content is also an important feature of shots.

Time-constrained clustering and adaptive time grouping are two representative methods. In the time-constrained clustering method [3], shot similarity comparison is constrained to a fixed time window; the similarity of two shots is considered zero if their temporal distance exceeds the length of this window. In the adaptive time grouping method [4], the shot similarity is a varying function related to the distance between two shots. Furthermore, a shot neighborhood coherence method has been presented [5][6]. First, the frames are divided into several subblocks; then the number of best-matched blocks is obtained, and the neighborhood coherence is defined in terms of the average smallest distance among the matched subblocks. According to the coherence values, overlapping links connecting similar shots are formed for scene segmentation.

The disadvantage of the methods above is that only the local color features and the DC elements of video are used. In this paper, by adopting the overlapping links, the proposed algorithm employs both global color features and motion features in shot similarity comparison. A scene merging method to handle over-segmented scenes is also presented. Experimental results show the effectiveness of the proposed algorithms.

2.
Shot clustering based scene segmentation algorithm

2.1 Shot boundary detection and key frame extraction

We adopt the shot boundary detection method that utilizes macroblock type information in MPEG

2007 International Conference on Computational Intelligence and Security Workshops 0-7695-3073-7/07 $25.00 © 2007 IEEE DOI 10.1109/CIS.Workshops.2007.106 259