S. Li et al. (Eds.): MMM 2013, Part I, LNCS 7732, pp. 368–379, 2013.
© Springer-Verlag Berlin Heidelberg 2013
Flexible Presentation of Videos
Based on Affective Content Analysis
Sicheng Zhao, Hongxun Yao, Xiaoshuai Sun, Xiaolei Jiang, and Pengfei Xu
School of Computer Science and Technology, Harbin Institute of Technology,
No.92, West Dazhi Street, Harbin, P.R. China, 150001
{zsc,h.yao,xiaoshuaisun,xljiang,pfxu}@hit.edu.cn
Abstract. The explosion of multimedia content has created a great demand
for video presentation. While most previous works focused on presenting
certain types of videos or on summarizing videos by event detection, we
propose a novel method to present general videos of different genres based
on affective content analysis. We first extract rich audio-visual affective
features and select discriminative ones. Then we map the selected features
into corresponding affective states in an improved categorical emotion space
using hidden conditional random fields (HCRFs). Finally, we draw affective
curves that indicate the types and intensities of emotions. Using these
curves and related affective visualization techniques, we select the most
affective shots and concatenate them to construct an affective video
presentation whose type and length are flexible and adjustable. Experiments
on a representative video database collected from the web demonstrate the
effectiveness of the proposed method.
Keywords: Video presentation, affective analysis, emotion space, HCRFs.
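The last steps of the pipeline in the abstract (shot-level emotion recognition followed by curve-based shot selection) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the per-shot (emotion, intensity) representation, and the sample data are assumptions, and the HCRF recognition stage is treated as a black box that has already labeled each shot.

```python
def build_affective_curve(shots):
    """Group labeled shots into {emotion: [(shot_index, intensity), ...]}.

    `shots` is a temporally ordered list of (emotion, intensity) pairs,
    assumed to come from an upstream classifier (e.g. an HCRF), which is
    not implemented here.
    """
    curve = {}
    for i, (emotion, intensity) in enumerate(shots):
        curve.setdefault(emotion, []).append((i, intensity))
    return curve


def select_presentation(shots, emotion, max_shots):
    """Pick up to `max_shots` shots of the requested emotion, ranked by
    intensity, and return their indices in original temporal order so the
    concatenated presentation preserves the video's timeline."""
    candidates = build_affective_curve(shots).get(emotion, [])
    top = sorted(candidates, key=lambda p: p[1], reverse=True)[:max_shots]
    return sorted(i for i, _ in top)


# Hypothetical per-shot output of the recognition stage.
shots = [("joy", 0.2), ("fear", 0.9), ("joy", 0.8),
         ("joy", 0.5), ("fear", 0.4)]
print(select_presentation(shots, "joy", 2))  # -> [2, 3]
```

Because both the target emotion and `max_shots` are parameters, the resulting presentation's type and length are adjustable, which mirrors the flexibility claimed in the abstract.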
1 Introduction
The explosion of multimedia content has created a great demand for video
presentation. On one hand, due to time limits, viewers need to get a gist of
the video content and watch highlights before deciding whether to view the
entire video (e.g., a movie). On the other hand, video broadcast platforms,
especially television stations, have to review a substantial number of videos
and select legal and valuable ones to air, which is a time-consuming and
tedious task. Thus, effective video presentation techniques can make video
reviewers' work more convenient and efficient.
Most previous works on content-based video presentation focused on certain
types of videos, such as sports videos and home videos, or on summarizing
videos by event detection [1-5]. Liu et al. [1] proposed a flexible racquet
sports video content summarization framework that combines a structure event
detection method with a highlight ranking algorithm. Zhao et al. [2] proposed
a system of highlight
summarization in sports videos based on replay detection. Based on three video
properties (emotional tone, local main character, and global main character),
Xiang and Kankanhalli [3] employed affective analysis to automatically create
adaptive presentations from home videos for three types of social groups: family, acquaintance