48 International Journal for Modern Trends in Science and Technology
International Journal for Modern Trends in Science and Technology, 7(01): 48-53, 2021
Copyright © 2021 International Journal for Modern Trends in Science and Technology
ISSN: 2455-3778 online
DOI: https://doi.org/10.46501/IJMTST070111
Available online at: http://www.ijmtst.com/vol7issue01.html
Automatic Summarization of Cricket Highlights
using Audio Processing
Ritwik Baranwal
Information Technology, Maharaja Agrasen Institute of Technology, New Delhi, India
To Cite this Article
Ritwik Baranwal, “Automatic Summarization of Cricket Highlights using Audio Processing”, International Journal for
Modern Trends in Science and Technology, Vol. 07, Issue 01, January 2021, pp.- 48-53.
Article Info
Received on 22-November-2020, Revised on 18-December-2020, Accepted on 22-December-2020, Published on 29-December-2020.
ABSTRACT
The problem of automatic excitement detection in cricket videos is considered and applied to highlight
generation. This paper focuses on detecting exciting events in video using complementary information from
the audio and video domains. First, a method for separating the audio and video elements is proposed.
Thereafter, the “level-of-excitement” is measured using features such as amplitude and spectral center of
gravity extracted from the commentators’ speech, with the amplitude used to decide the threshold. Our
experiments using actual cricket videos show that these features are well correlated with human assessment
of excitability. Finally, the audio/video information of scenes that have “excitability” is fused in time
order to generate cricket highlights. The techniques described in this paper are generic and applicable to
a variety of topics and video/acoustic domains.
KEYWORDS: Video Segmentation, Audio Chunks, Short Time Energy.
I. INTRODUCTION
This study focuses on the problem of identifying
exciting events in multimedia content. Our
approach analyzes speech characteristics that
identify islands (or “hot-spots”) of strong emotion.
In general, the ability to automatically parse
multimedia content and tag “interesting events” is
important for many domains such as sports,
security, movies/TV shows, broadcast news, etc. A
number of technologies, such as search,
summarization, and mash-ups, can utilize “hot-spot”
information to enhance access to, as well as
navigation of content. For example, emotional
“hot-spots” within sports videos are very likely to be
“exciting” and this information can be used to
guide the process of automatically generating
highlights. This constitutes the motivation for this
work, where automatic highlights of cricket videos
are generated using emotional “hot-spot” detection
(or “exciting events” detection).
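As a rough illustration of what audio-based “hot-spot” detection can look like, the minimal sketch below computes the short-time energy of fixed-length audio chunks and flags the chunks whose energy exceeds a threshold. The function names, the chunk length, and the mean-based threshold rule are assumptions made for illustration only; they are not taken from the method described in this paper.

```python
import math

def short_time_energy(samples, chunk_len):
    """Return the mean squared amplitude of each non-overlapping chunk."""
    energies = []
    for start in range(0, len(samples) - chunk_len + 1, chunk_len):
        chunk = samples[start:start + chunk_len]
        energies.append(sum(x * x for x in chunk) / chunk_len)
    return energies

def exciting_chunks(samples, chunk_len, factor=1.5):
    """Return indices of chunks whose energy exceeds factor * mean energy.

    The threshold rule (a multiple of the mean energy) is an assumed,
    illustrative choice.
    """
    energies = short_time_energy(samples, chunk_len)
    if not energies:
        return []
    threshold = factor * sum(energies) / len(energies)
    return [i for i, e in enumerate(energies) if e > threshold]

# Synthetic signal: three quiet chunks and one loud chunk (index 2).
quiet = [0.1 * math.sin(2 * math.pi * 5 * n / 100) for n in range(300)]
loud = [0.9 * math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
signal = quiet[:200] + loud + quiet[200:]

print(exciting_chunks(signal, chunk_len=100))  # prints [2]
```

In this toy signal only the loud chunk is flagged. With real commentary audio, the chunk length and threshold factor would need to be tuned against human judgments of excitement.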
Researchers have utilized audio and video
streams to extract features that identify exciting
plays in sports videos. Among video-based
features, motion and density of cuts have been
found to be useful for detection [1]. On the other
hand, audio-based features have been derived
from both speech (generally commentators) and
background (generally audience), where
audience-events like cheering/applause as well as
the commentators’ speech characteristics have
proven to be useful [2,3]. While video-based
features tend to be more game-dependent,
audio-based features tend to be more generic for
detecting exciting plays.
Research in audio-based features has focused on
emotion analysis of the commentator’s speech and