MPEG-21 concepts for personalized consumption of heterogeneous multimedia

G. Andreou¹, K. Karpouzis¹, I. Maglogiannis², S. Kollias¹
¹ Image, Video and Multimedia Systems Laboratory, National Technical University of Athens, Greece
² Dept. of Information and Communication Systems Engineering, University of Aegean, Samos, Greece
{kkarpou, geand}@image.ntua.gr, imaglo@aegean.gr, stefanos@cs.ntua.gr

Abstract

In the framework of digital TV, viewers are currently being presented with technological developments and a plethora of content sources that create diverse possibilities for both entertainment and profit. Dynamic domains such as sports broadcasting create even higher expectations, forcing broadcasting corporations and content providers to seek new ways of integrating and presenting enhanced content to their customers, while modifying their production cycle as little as possible. In this paper, we present an outline of a system that integrates heterogeneous multimedia content and performs adaptive packaging operations, while preserving the intellectual property rights (IPR) of the respective contributors and taking into account the cumulative profiles of the viewers. To cater for this, the MELISA system makes extensive use of concepts included in the MPEG-21 ‘Multimedia Framework’ standard, such as Digital Item Adaptation (DIA).

1. Introduction to the MELISA system

MELISA is an acronym for Multi-Platform e-Publishing for Leisure and Interactive Sports Advertising. The overall system architecture consists of the sender side, which is responsible for content preparation, management and packaging, and the receiver side, typically a stand-alone machine that receives the packaged content, decodes it, performs filtering operations on its components and enhancements, and handles interactivity. A detailed description of the overall system is provided in [1], and a process-level diagram is shown in Figure 1.
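
The division of labour between the two sides can be illustrated with a minimal Python sketch. It is not the MELISA implementation: the class names, the enhancement kinds and the profile fields are all hypothetical, chosen only to show packaging on the sender side and profile-based filtering on the receiver side.

```python
from dataclasses import dataclass, field


@dataclass
class Enhancement:
    kind: str       # hypothetical label, e.g. "advertisement" or "bet_offer"
    event_id: str   # the sport event this enhancement belongs to
    payload: str    # stand-in for the encoded enhancement data


@dataclass
class Package:
    event_id: str
    enhancements: list[Enhancement] = field(default_factory=list)


@dataclass
class ViewerProfile:
    accepted_kinds: set[str]    # kinds of enhancement the viewer accepts
    followed_events: set[str]   # events the viewer wants to follow


def package_content(event_id: str, enhancements: list[Enhancement]) -> Package:
    """Sender side (assumed behaviour): bundle the enhancements of one event."""
    selected = [e for e in enhancements if e.event_id == event_id]
    return Package(event_id, selected)


def filter_package(package: Package, profile: ViewerProfile) -> list[Enhancement]:
    """Receiver side (assumed behaviour): keep only what the profile allows."""
    if package.event_id not in profile.followed_events:
        return []
    return [e for e in package.enhancements if e.kind in profile.accepted_kinds]


if __name__ == "__main__":
    items = [
        Enhancement("advertisement", "match1", "team sponsor banner"),
        Enhancement("bet_offer", "match1", "next goal scorer"),
        Enhancement("advertisement", "match2", "athlete endorsement"),
    ]
    profile = ViewerProfile({"advertisement"}, {"match1"})
    for enhancement in filter_package(package_content("match1", items), profile):
        print(enhancement.kind, enhancement.payload)
```

Keeping the filter on the receiver lets one packaged stream serve viewers with different profiles, which matches the receiver-side filtering role described above.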

The content preparation process on the sender side includes tasks related to displaying interactive advertisements for athletes or teams that participate in the particular event, presenting bet offers during a broadcast on a play-by-play basis, and designing templates for enhanced content at transmission time.

Figure 1: Overall process-level architecture of the MELISA system

The main adaptation processes deal with the level and shape of visual enhancements that will be presented to the end user, while in cases of parallel events, content filtering may also take place on the receiver side. Authoring visual enhancements takes place both offline and online: offline authoring takes place before the start of the sport event and consists of creating an initial scene for the presentation, while online authoring consists of generating and transmitting real-time encoded visual enhancements during the event.

In the MELISA system, visual enhancements are encoded into MPEG-4 BIFS (Binary Format for Scenes) [2], which is a binary representation of the spatio-temporal positioning of audio-visual objects and their behavior in response to interaction or other events. Central to the MPEG-4 framework is the concept of a scene, which can be defined as what the user sees and hears [3] and can be represented by a tree that de-
First International Workshop on Semantic Media Adaptation and Personalization (SMAP'06) 0-7695-2692-6/06 $20.00 © 2006