MPEG-21 concepts for personalized consumption
of heterogeneous multimedia
G. Andreou¹, K. Karpouzis¹, I. Maglogiannis², S. Kollias¹
¹ Image, Video and Multimedia Systems Laboratory,
National Technical University of Athens, Greece
² Dept. of Information and Communication Systems Engineering,
University of Aegean, Samos, Greece
{kkarpou, geand}@image.ntua.gr, imaglo@aegean.gr, stefanos@cs.ntua.gr
Abstract
In the framework of digital TV, viewers are cur-
rently being presented with technological develop-
ments and a plethora of content sources that create
diverse possibilities for both entertainment and profit.
Dynamic domains such as sports broadcasting create
even higher expectations, forcing broadcasting corpo-
rations and content providers to seek new ways of in-
tegrating and presenting enhanced content to their
customers, while modifying their production cycle as
little as possible. In this paper, we present an outline of
a system that integrates heterogeneous multimedia
content and performs adaptive packaging operations,
while preserving the intellectual property rights (IPR)
of the respective contributors and taking into account
the cumulative profiles of the viewers. To cater for this,
the MELISA system makes extensive use of concepts
included in the MPEG-21 ‘Multimedia Framework’
standard, such as Digital Item Adaptation (DIA).
1. Introduction to the MELISA system
MELISA is an acronym for Multi-Platform e-
Publishing for Leisure and Interactive Sports Advertis-
ing. The overall system architecture consists of the
sender side, which is responsible for content preparation,
management and packaging, and the receiver side,
a typically stand-alone machine that receives the
packaged content, decodes it, performs filtering operations
on its components and enhancements, and handles
interactivity. A detailed description of the overall system
is provided in [1], and a process-level diagram is
shown in Figure 1. The content preparation process on
the sender side includes tasks related to displaying
interactive advertisements related to athletes or teams
that participate in the particular event, bet offers during
a broadcast on a play-by-play basis, and template de-
sign for enhanced content at transmission time.
Figure 1: Overall process-level architecture of the
MELISA system
The main adaptation processes deal with the level
and shape of visual enhancements that will be
presented to the end user, while in cases of parallel events,
content filtering may also take place on the receiver
side. Authoring visual enhancements takes place both
offline and online: offline authoring takes place before
the start of the sport event and consists of creating an
initial scene for the presentation, while online author-
ing consists of generating and transmitting real-time
encoded visual enhancements during the event.
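The receiver-side filtering described above can be illustrated with a minimal sketch. It is not the MELISA implementation: the `Enhancement` and `ViewerProfile` types, their fields, and the `filter_enhancements` function are hypothetical names introduced here only to show how a viewer profile might restrict both the level of enhancements and, for parallel events, the events whose content is kept.

```python
from dataclasses import dataclass, field


@dataclass
class Enhancement:
    """Hypothetical visual enhancement received alongside the broadcast."""
    event_id: str      # the (possibly parallel) event this enhancement belongs to
    kind: str          # e.g. "advertisement", "bet_offer", "statistics"
    detail_level: int  # 1 = minimal overlay; larger values = richer graphics


@dataclass
class ViewerProfile:
    """Hypothetical viewer preferences used for receiver-side adaptation."""
    followed_events: set
    max_detail_level: int
    blocked_kinds: set = field(default_factory=set)


def filter_enhancements(enhancements, profile):
    """Keep only enhancements that match the viewer's profile.

    An enhancement is kept when it belongs to a followed event, its kind
    is not blocked, and its detail level does not exceed the viewer's limit.
    """
    return [
        e for e in enhancements
        if e.event_id in profile.followed_events
        and e.kind not in profile.blocked_kinds
        and e.detail_level <= profile.max_detail_level
    ]
```

In this sketch the profile is applied as a simple predicate over each enhancement; a real receiver would presumably derive these constraints from MPEG-21 DIA descriptions rather than from hand-written fields.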
In the MELISA system, visual enhancements are
encoded into MPEG-4 BIFS [2] (Binary Format for
Scenes), which is a binary representation of the spatio-
temporal positioning of audio-visual objects and their
behavior in response to interaction or other events.
Central to the MPEG-4 framework is the concept of a
scene, which can be defined as what the user sees and
hears [3] and can be represented by a tree that de-
First International Workshop on Semantic Media Adaptation and Personalization (SMAP'06)
0-7695-2692-6/06 $20.00 © 2006