Enhanced Shot-Based Video Adaptation using
MPEG-21 generic Bitstream Syntax Schema
Sarah De Bruyne, Davy De Schrijver, Wesley De Neve, Davy Van Deursen, Rik Van de Walle
Department of Electronics and Information Systems - Multimedia Lab - Ghent University - IBBT
Gaston Crommenlaan 8 bus 201, B-9050 Ledeberg-Ghent, Belgium
Email: {sarah.debruyne, davy.deschrijver, wesley.deneve, davy.vandeursen, rik.vandewalle}@ugent.be
Abstract— Semantic video adaptation takes into account the
relevance of the different fragments of the video content in order
to create a tailored video stream based on the user’s preferences.
Since a shot can be regarded as the smallest semantic unit in
a video sequence, metadata can be added to each shot using
MPEG-7 descriptions. Based on these metadata and the user’s
preferences, the original bitstream can be adapted in order to
obtain the desired fragments. MPEG-21 DIA offers a tool, gBS
Schema, for exposing the high-level structure of a binary resource
as an XML description. In this paper, shot information is inserted
in these descriptions to create a link between metadata and
semantic video adaptation. Furthermore, this paper proposes
to keep the structure of these descriptions format-agnostic. As
a result, only one generic transformation style sheet has to be
implemented to support shot-based video adaptation of sequences
compliant with different video specifications. Special attention is
paid to sequences coded with the H.264/AVC standard, as this
specification contains several interesting features important for
shot-based video adaptation.
I. INTRODUCTION
As multimedia has proliferated over the past years, many
new technologies have been developed to enable the delivery
and consumption of multimedia content. Users have come to expect
that this content can easily be accessed according to their own
preferences. Therefore, the delivered content must be tailored
to the user’s characteristics and preferences, as well as to the
capacities of the terminals and networks.
Video adaptation [1] is an emerging field of interest that in-
cludes techniques responding to the above challenges. Several
adaptation strategies can be identified, operating at a
semantic level (e.g., removal of violent scenes or extraction
of semantic highlights), at a structural level (e.g., key frame
extraction), or at a signal-processing level (e.g., transcoding).
To adapt a video sequence, MPEG-21 Digital Item Adapta-
tion (DIA) [2] offers a tool, generic Bitstream Syntax Schema
(gBS Schema), to describe the high-level structure of the
bitstream using the Extensible Markup Language (XML). The
resulting XML document is called a generic Bitstream Syntax
Description (gBSD), which makes it possible to describe the
bitstream in a coding format-agnostic manner.
This paper concentrates on the link between metadata and
format-agnostic semantic video adaptation by making use
of gBS Schema. This way, metadata and semantic video
adaptation can be coupled in an elegant manner. To this end,
shot information is inserted in the gBSDs indicating to which
shot each frame belongs. The selection of the desired shots
can be obtained by using MPEG-7 descriptions containing
metadata about the different shots. Once the desired shots are
indicated, a generic transformation style sheet is used to obtain
the desired adapted sequence by linking the desired shots to
the shot information available in the gBSD. Special attention
needs to be paid to the extraction of the desired fragments,
as the adapted bitstream needs to remain compliant with the
corresponding specification.
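As a minimal sketch of this idea, the following Python fragment filters a simplified description so that only the units of the desired shots remain. The description content, the byte offsets, and the use of the marker attribute to carry a shot identifier are illustrative assumptions, not the exact descriptions used in this paper; namespaces are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified description: one unit per frame, each tagged with
# the shot it belongs to. All names and values here are illustrative.
DESCRIPTION = """
<gBSD>
  <gBSDUnit syntacticalLabel="frame" start="0" length="1200" marker="shot-1"/>
  <gBSDUnit syntacticalLabel="frame" start="1200" length="300" marker="shot-1"/>
  <gBSDUnit syntacticalLabel="frame" start="1500" length="1100" marker="shot-2"/>
  <gBSDUnit syntacticalLabel="frame" start="2600" length="900" marker="shot-3"/>
</gBSD>
"""


def keep_shots(description_xml, wanted_shots):
    """Parse the description and drop every unit outside the wanted shots."""
    root = ET.fromstring(description_xml.strip())
    # Iterate over a copy of the children, since we remove while looping.
    for unit in list(root):
        if unit.get("marker") not in wanted_shots:
            root.remove(unit)
    return root


# The wanted shots would normally follow from the metadata and the user's
# preferences; here they are simply hard-coded.
adapted = keep_shots(DESCRIPTION, {"shot-1", "shot-3"})
kept_starts = [unit.get("start") for unit in adapted]
print(kept_starts)  # ['0', '1200', '2600']
```

This sketch only selects units by shot label. It does not check that the remaining units still form a compliant bitstream, which is the part that requires the special attention mentioned above.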
Related work includes a semantic adaptation framework
for the generation of semantic metadata and the semantic
adaptation of video on a frame basis using gBS Schema [3].
Furthermore, [4] and [5] focus on video adaptation using
gBS Schema. In particular, an example of a gBSD is given
which is used to classify fragments of a video using semantic
information.
This paper is organized as follows. The following section
introduces the main enabling technologies and concepts, while
Sect. III discusses the shot-based adaptation process. Experi-
mental results are given in Sect. IV.
II. ENABLING TECHNOLOGIES AND CONCEPTS
A. gBSD-driven Content Adaptation
MPEG-21 gBS Schema is a tool of part 7 (Digital Item
Adaptation, DIA) of the MPEG-21 specification used to
facilitate content adaptation [4], [5]. To realize this, gBS
Schema defines a framework that enables the description of
the high-level structure of a bitstream in XML, resulting in
a Bitstream Syntax Description (BSD). This description is
not meant to describe the bitstream on a bit-by-bit basis, but
rather addresses its high-level structure. In Fig. 1, a global
architecture for a BSD-based content adaptation framework is
given. First, a BSD of the high-level structure of the bitstream
is generated. This BSD is then adapted according to the user’s
preferences by means of a transformation language. Finally,
the adapted BSD becomes input to an adaptation module
responsible for the generation of the corresponding bitstream.
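The final stage of this architecture can be illustrated with a short Python sketch, assuming that each unit of the adapted description addresses a byte range of the original bitstream through start and length attributes. The function name regenerate, the toy bitstream, and the simplified, namespace-free description are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET


def regenerate(bitstream: bytes, adapted_description: str) -> bytes:
    """Copy the byte ranges named in the adapted description, in order."""
    root = ET.fromstring(adapted_description.strip())
    out = bytearray()
    for unit in root.iter("gBSDUnit"):
        start = int(unit.get("start"))
        length = int(unit.get("length"))
        out += bitstream[start:start + length]
    return bytes(out)


# Toy bitstream made of three 4-byte "fragments" (illustrative only).
original = b"AAAABBBBCCCC"

# Adapted description in which the middle fragment has been removed.
adapted_description = """
<gBSD>
  <gBSDUnit start="0" length="4"/>
  <gBSDUnit start="8" length="4"/>
</gBSD>
"""

adapted = regenerate(original, adapted_description)
print(adapted)  # b'AAAACCCC'
```

Because the adaptation module only follows the offsets in the description, it needs no knowledge of the coding format itself.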
gBS Schema uses only one generic schema to describe the
structure of a generic BSD (gBSD), making the syntax of the
gBSD generic and codec-independent. Therefore, the regeneration
of the adapted bitstream can be achieved without the need
for codec-specific schemas. Furthermore, this schema makes it
possible to describe the bitstream in a hierarchical fashion
and provides semantically meaningful marking of syntactical
Proceedings of the 2007 IEEE Symposium on Computational
Intelligence in Image and Signal Processing (CIISP 2007)
1-4244-0707-9/07/$25.00 ©2007 IEEE