QPSR of the numediart research program, Vol. 2, No. 1, March 2009

MEDIACYCLE: BROWSING AND PERFORMING WITH SOUND AND IMAGE LIBRARIES

Xavier Siebert 1, Stéphane Dupont 2, Philippe Fortemps 1, Damien Tardieu 2

1 Laboratoire de Mathématique et Recherche Opérationnelle (MathRo), Faculté Polytechnique de Mons (FPMs), Belgique
2 Laboratoire de Théorie des Circuits et Traitement du Signal (TCTS), Faculté Polytechnique de Mons (FPMs), Belgique

ABSTRACT

The MediaCycle project, as part of numediart's HyForge research axis, aims at developing a novel browsing environment for multimedia (sound, image, video) databases that offers an alternative to conventional search-by-query. The databases are organized so that users can conveniently retrieve the items that they need. This project is an extension of the software developed in AudioCycle (numediart #4.1), from sound to images. Extensions to video databases will be investigated in upcoming projects.

KEYWORDS

MediaCycle, MultiMedia Databases, Content-based Navigation, Image Features

1. INTRODUCTION

Multimedia database search initially relied on metadata associated with each medium (sound, image, video, ...), such as captions or keywords. This approach suffers from two major drawbacks: tagging is a tedious process (which limits its application to small databases), and it does not really capture the meaning of the media. More recently, several software tools shifted from a metadata-based approach to a content-based one, resulting notably in the Query By Image Content (QBIC) commercial software [6], as well as several other tools for image [16, 17, 10] or sound [8] databases. The MediaCycle project, as part of numediart's HyForge research axis, aims at developing a novel browsing environment that offers an alternative to conventional search-by-query. The databases are organized so that users can conveniently retrieve the items (sound, images, videos) that they need.
The architecture of our browsing software is similar to that developed in AudioCycle (numediart #4.1), extending it from sound to images. It offers a wide range of potential applications, from browsing a medical image library to media art installations and live performances (e.g., Resolume [15] or Union VJ [18]). In the case of sound and music (AudioCycle), content referred to rhythm, harmony, melody, timbre, etc., whereas in the case of images (MediaCycle) it refers to attributes such as color, shape, and texture. Possible extensions include navigation in video databases, where content can additionally be characterized by camera motion parameters (e.g., zoom, pan), object motion, and other dynamic attributes.

2. INTERFACE DESIGN

The MediaCycle browsing interface (see snapshot in Fig. 1) contains the following elements to browse image databases:

• The top right corner contains three sliders, labeled shape, color and texture, that allow the user to define the weights of these three attributes (corresponding to rhythm, timbre and harmony for sounds).
• In the main frame, the images are displayed in such a way that their mutual closeness reflects their similarity, as defined by the sliders' weights.
• The images are clustered according to the above-mentioned distance. The user can navigate in and out of one such cluster using the arrows above the interface, on the left.
• When the mouse hovers over an image (as in Fig. 1), this image instantaneously becomes larger, so that the user can quickly browse through the whole database.
• The bottom right panel (below the sliders, blank in Fig. 1) serves to display additional information about the images.

Figure 1: Overview of the MediaCycle Interface

Care has been taken in the development of the software to ensure a common architecture for all media (sound, image, video), with minimal changes in the interface when switching from one medium to another.

3.
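The similarity layout above rests on a distance that combines per-attribute (shape, color, texture) differences according to the slider weights. A minimal sketch of such a weighted combination, assuming Euclidean distances over toy feature vectors (the function name, feature names, and normalization are illustrative assumptions, not MediaCycle's actual implementation):

```python
import numpy as np

def weighted_distance(a, b, weights):
    """Combine per-feature distances between two images using slider weights.

    a, b: dicts mapping a feature name ("shape", "color", "texture")
    to a 1-D feature vector; weights: dict of slider positions.
    """
    total = 0.0
    for name, w in weights.items():
        # Euclidean distance between the two feature vectors for this attribute
        d = np.linalg.norm(np.asarray(a[name], float) - np.asarray(b[name], float))
        total += w * d
    return total

# Example: two images described by toy feature vectors.
img1 = {"shape": [0.2, 0.8], "color": [0.1, 0.3, 0.6], "texture": [0.5]}
img2 = {"shape": [0.25, 0.7], "color": [0.4, 0.3, 0.3], "texture": [0.9]}

# Slider positions: texture weight 0 makes texture irrelevant to the layout.
weights = {"shape": 1.0, "color": 0.5, "texture": 0.0}
print(round(weighted_distance(img1, img2, weights), 3))
```

Setting a slider to zero removes that attribute from the distance entirely, which is how moving the cursors re-arranges the main frame.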
IMAGE FEATURES

Each image contains a wealth of information that can be readily interpreted by a human eye. However, for the computer an image is simply a set of pixels with values for each color channel (e.g., red, green, blue) or grey level. To compare images in terms that are interpretable by a person, the corresponding features (e.g., color, texture, shape) have to be extracted from the image, as described below.

3.1. Color

As pointed out by a recent review of image features [4], color histograms generally provide a simple but efficient way to distinguish images. The color range (e.g., from 0 to 255) is partitioned into bins and, for each color channel (e.g., red, green, blue), the pixels with a color within a range are counted, resulting in a description
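The per-channel binning described above can be sketched as follows; the bin count, normalization, and function name are illustrative choices, not the paper's exact implementation:

```python
import numpy as np

def color_histogram(image, n_bins=8, value_range=(0, 256)):
    """Per-channel color histogram of an H x W x 3 image.

    The color range (0..255) is partitioned into n_bins bins and,
    for each channel, the pixels falling in each bin are counted.
    Counts are normalized so images of different sizes compare.
    """
    hists = []
    for c in range(image.shape[-1]):
        counts, _ = np.histogram(image[..., c], bins=n_bins, range=value_range)
        hists.append(counts / counts.sum())
    # Concatenate the three channel histograms into one descriptor vector.
    return np.concatenate(hists)

# Example: a random 32x32 RGB image yields a 3 * 8 = 24-dimensional descriptor.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
h = color_histogram(img)
print(h.shape)
```

The resulting vector can then feed a distance computation such as the weighted combination used by the browser's layout.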