20th Iranian Conference on Electrical Engineering, (ICEE2012), May 15-17, Tehran, Iran
Video Summarization Using Fuzzy C-Means Clustering
Ebrahim Asadi*, Nasrolla Moghadam Charkari**
* Tarbiat Modares University, e.asadi@modares.ac.ir
** Tarbiat Modares University, moghadam@modares.ac.ir
Abstract: The rapid growth of the digital world and of computer
networking is contributing to an enormous and continuous
growth of video content. Despite the great growth in digital
video technologies, the capabilities of users to manipulate,
interact with and manage videos are still far behind what users
can achieve with other types of media such as text or images.
This is primarily because of the temporal and multi-modal
nature of video and the size of the associated medium. Among
research topics, video summarization is an important one that
enables faster browsing of large video collections and more
efficient content indexing and access. We introduce a new
keyframe extraction system that produces static video
summaries using fuzzy c-means clustering. For each cluster, we
choose the frame with the maximum membership grade as the
keyframe. The number of clusters is estimated with a simple
method. Summaries produced by users are used for evaluation.
These summaries are compared both to our approach and to a
number of other techniques in the literature. Experimental
results show that the proposed solution provides static video
summaries that are more relevant to the original video and to
the users' intention. Our method also achieves high accuracy
with a low error rate.
Keywords: Video summarization, Keyframe extraction,
Fuzzy C-Means, Clustering.
1. Introduction
Rapid development of computation, communications,
and storage infrastructures is contributing to an enormous
and steadily growing availability of video content.
Despite the enormous investments in digital video
technologies, the capabilities of an average user to
manipulate, interact with and manage videos are still far behind
what average users can achieve with other types of media
such as text or images. This is mainly due to the temporal
and multi-modal nature of video and the size of the
associated medium.
To find items of interest in this ocean of multimedia
content, users have adopted services such as electronic
program guides, TV web-portals, and web search engines
that aggregate information relevant to the users' queries
and allow them to easily find the content they are looking
for.
978-1-4673-1148-9/12/$31.00 ©2012 IEEE
However, while the content on offer and its availability to
average users have increased enormously, the free time for
consuming content has not increased much. The key
problem of each consumer is to make efficient use of the
free time available for enjoying content.
Automatic video summarization aims at creating
efficient representations of video for facilitating browsing,
search and, more generically, management of digital
multimedia content. Automatically generated summaries
can support users in navigating large video archives and
in making decisions more efficiently about selecting,
consuming, sharing, or deleting content.
Video summarization is a technique for producing a still or
moving sequence of images from an original video as a
summary of that video. There are two main video
summarization techniques [1]: static video summarization
(keyframe extraction) and dynamic video summarization
(video skimming). A static video summary consists of a set
of frames (keyframes) extracted from the original video,
while a dynamic video summary is a video clip that consists
of a collection of video segments (and corresponding
audio) extracted from the original video.
Video skims include audio and motion, and therefore convey
more information. In addition, it is often more entertaining
and interesting to watch a skim than a slide show of
keyframes [2]. On the other hand, keyframe sets are not
restricted by any timing or synchronization issues and,
therefore, they offer much more flexibility in terms of
organization for browsing and navigation purposes, in
comparison to the strictly sequential display of video skims
[3-6]. Our proposed method produces a static video
summary.
Various approaches have been proposed in the literature,
most of them based on clustering techniques [7-11].
The basic idea is to cluster similar frames/shots together
and then extract some frames (generally one frame)
per cluster as keyframes. These methods differ in the
features (e.g., color histogram, luminance, and motion
vector) and the clustering algorithms (e.g., k-means,
hierarchical) they use.
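
As a rough illustration of this basic idea, the following Python sketch clusters per-frame feature vectors with a textbook fuzzy c-means and then picks, for each cluster, the frame with the maximum membership grade, as outlined in the abstract. It is a minimal sketch under stated assumptions, not the implementation evaluated in this paper: the function names, the fuzzifier m = 2, the stopping rule, the toy two-bin "histograms", and the fixed number of clusters are all hypothetical choices made for illustration (the paper estimates the number of clusters with a separate method).

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means on a list of equal-length feature vectors.

    Returns (centers, memberships), where memberships[i][k] is the
    membership grade of point i in cluster k. All parameter names and
    defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership^m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Membership update from Euclidean distances to the centers.
        shift = 0.0
        for i in range(n):
            dist = [math.dist(points[i], centers[k]) for k in range(c)]
            if min(dist) == 0.0:
                # The point coincides with a center: full membership there.
                zero = dist.index(0.0)
                new_row = [1.0 if k == zero else 0.0 for k in range(c)]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
                total = sum(inv)
                new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once no membership grade changed by more than tol.
        if shift < tol:
            break
    return centers, u


def select_keyframes(memberships, c):
    """For each cluster, pick the frame index with the highest membership."""
    n = len(memberships)
    chosen = {max(range(n), key=lambda i: memberships[i][k]) for k in range(c)}
    return sorted(chosen)


# Hypothetical toy features: six frames from two visually distinct shots.
frames = [[0.90, 0.10], [0.88, 0.12], [0.92, 0.08],
          [0.10, 0.90], [0.12, 0.88], [0.08, 0.92]]
centers, memberships = fuzzy_c_means(frames, c=2)
keyframes = select_keyframes(memberships, c=2)
print("keyframe indices:", keyframes)
```

In this sketch each frame receives a membership grade in every cluster rather than a single hard label, which is what distinguishes fuzzy c-means from k-means. Real feature vectors (for example, color histograms) would replace the toy values.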