THE MULTIMEDIAN CONCERT-VIDEO BROWSER

Ynze van Houten 1, Umut Naci 2, Bauke Freiburg 3, Robbert Eggermont 2, Sander Schuurman 3, Danny Hollander 3, Jaap Reitsma 1, Maurice Markslag 1, Justin Kniest 3, Mettina Veenstra 1 & Alan Hanjalic 2

1 Telematica Instituut, P.O. Box 589, 7500 AN Enschede, The Netherlands
2 Delft University of Technology, Information and Communication Theory Group, Mekelweg 4, 2628 CD Delft, The Netherlands
3 Stichting Fabchannel, Weteringschans 6-8, 1017 SG Amsterdam, The Netherlands

ABSTRACT

The MultimediaN concert-video browser demonstrates a video interaction environment for efficiently browsing video registrations of pop, rock and other music concerts. The exhibition displays the current state of the project, which aims to deliver an advanced concert-video browser in 2007. Three demos are provided:

1) a high-level content analysis methodology for modeling the “experience” of the concert at its different stages, and for automatically detecting and identifying semantically coherent temporal segments in concert videos;
2) a general-purpose video editor that associates semantic descriptions with video segments using both manual and automatic inputs, and a video browser that applies ideas from information foraging theory and demonstrates patch-based video browsing;
3) the Fabplayer, specifically designed for patch-based browsing of concert videos by a dedicated user group, making use of the results of automatic concert-video segmentation.

1. INTRODUCTION

The central design issue in video interaction is efficiency: helping users find useful, interesting, or appealing video segments quickly. The MultimediaN 1 concert-video browser demonstrates a video interaction environment for efficiently browsing video registrations of pop concerts performed at the Dutch concert halls Paradiso and Melkweg, available on the Fabchannel website [1]. The exhibition displays the current state of the project, which aims to deliver an advanced concert-video browser in 2007.
1 MultimediaN is a Dutch national research program in the field of multimedia, 2004-2009.

This includes the development of an automatic video content analysis algorithm providing non-linear access to semantically coherent video segments according to their content and the “experience” they elicit, a multi-purpose patch-based video editing and browsing environment, and a mock-up of a concert-video browser with functionality and design serving a dedicated user group.

Current interaction with concert videos (as in the old design of the Fabplayer) is mostly limited to selecting and playing a concert. This project aims at much more advanced user interaction, where users can interact with smaller semantic units than complete concerts. These units are not limited to songs, but also include smaller units with a particular affective (e.g., excitement) or cognitive (e.g., solo, vocals, instrumental) content. Detection of these semantically coherent temporal segments is preferably performed automatically. Next, the segments are semantically described, with attributes such as who is performing, which instruments are played, what the performers look like (e.g., clothing), which texts are uttered, etc. From these descriptions, a set of video patches can be created. Video patches are collections of fragments sharing a certain attribute. For example, a video patch can be the collection of all fragments with guitar solos, all fragments with stage divers, or all songs from a specific album. The attributes are not chosen arbitrarily, but are acquired from end users via surveys. The fragments in a patch can come from one concert video or from the whole collection of concert videos. The user can browse the layer of patches and thus interact richly with the underlying concert-video database.
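The patch concept described above can be sketched as a simple data model: a fragment is a time interval in one concert video carrying a set of attributes, and a patch is the collection of all fragments sharing one attribute. The following is a minimal illustration, not code from the project; all names and the sample data are invented for this sketch.

```python
from dataclasses import dataclass, field

@dataclass
class Fragment:
    """A semantically coherent temporal segment of one concert video."""
    concert_id: str
    start_s: float                              # start time in seconds
    end_s: float                                # end time in seconds
    attributes: set = field(default_factory=set)  # e.g. {"guitar solo", "vocals"}

def build_patch(fragments, attribute):
    """A video patch: all fragments (possibly from several concerts)
    sharing the given attribute."""
    return [f for f in fragments if attribute in f.attributes]

# Purely illustrative fragments from two hypothetical concert videos.
fragments = [
    Fragment("paradiso-2005-06-01", 120.0, 185.0, {"guitar solo"}),
    Fragment("paradiso-2005-06-01", 400.0, 430.0, {"vocals"}),
    Fragment("melkweg-2005-07-12", 90.0, 150.0, {"guitar solo", "excitement"}),
]

# The "guitar solo" patch spans both concerts, as described in the text.
patch = build_patch(fragments, "guitar solo")
```

Note that the patch is computed from attributes, not stored per concert, which is what lets a single patch draw fragments from the whole collection of concert videos.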
The browsing activity itself can be the goal, or the user can search for music fragments to create a playlist of favourite parts from one or several concerts, which can then be replayed or shared with other users.

0-7803-9332-5/05/$20.00 ©2005 IEEE
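The playlist scenario above amounts to bundling selected fragments, possibly drawn from several concerts, into an ordered, shareable unit. A minimal sketch of this idea follows; the representation (tuples of concert id and time bounds) and all names are assumptions made for illustration only.

```python
# Each favourite fragment is (concert_id, start_s, end_s); purely illustrative data.
favourites = [
    ("paradiso-2005-06-01", 120.0, 185.0),
    ("melkweg-2005-07-12", 90.0, 150.0),
]

def make_playlist(name, fragments):
    """Bundle fragments from one or more concerts into an ordered playlist
    that could be replayed or shared with other users."""
    return {
        "name": name,
        "items": list(fragments),
        "duration_s": sum(end - start for _, start, end in fragments),
    }

playlist = make_playlist("favourite solos", favourites)
# duration_s: (185.0 - 120.0) + (150.0 - 90.0) = 125.0 seconds
```

Because the playlist stores references (concert id plus time bounds) rather than copies of the video, sharing it with another user only requires that both can access the underlying concert-video database.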