FLEXIBLE DESCRIPTOR INDEXING FOR MULTIMEDIA DATABASES

Catalin Calistru
INESC-Porto
FCT PhD grant SFRH/BD/12426/2003
cmc@inescporto.pt

Cristina Ribeiro, Gabriel David
FEUP/INESC-Porto
mcr@fe.up.pt, gtd@fe.up.pt

ABSTRACT

Current multimedia applications require efficient tools for modeling and search. Solid models must support a large variety of concepts, such as sets, subsets and part-of hierarchies, and integrate standards like Dublin Core, MPEG-7 (descriptors, description schemes), TV-Anytime and MPEG-21 (digital item, digital item adaptation, digital item identification). The search algorithms, in turn, must deal with heterogeneous multimedia data (text, image, audio and video). Search in audiovisual data requires heterogeneous metadata representing a wide range of features, from low-level ones (color, motion) to high-level ones (subject, mood) and domain-dependent concepts.

The current work is part of the development of a prototype Multimedia Database and addresses the problem of choosing the proper storage and search strategy for the descriptors in the multimedia database. It includes a review of the main applicable data structures and search techniques and a case study from the point of view of our system requirements.

1. INTRODUCTION

Multimedia databases can be highly diverse, both in the nature of the stored objects and in the application domains. The database may be a repository of objects which have been collected and appropriately described (as in a digital library), may consist of heterogeneous information assembled and dynamically modified in a Web site, or may result from the production process of a publisher or a broadcaster.

Many standards have now been established for multimedia information, both for the objects themselves and for the corresponding metadata, which is the focus of this paper. Metadata may describe the context in which the multimedia (MM) objects have been created, are stored or can be used, or describe aspects of the object content.
Metadata standards can be applied to traditional descriptive metadata (Dublin Core), to content metadata (MPEG-7) and to the various aspects of assembling components, handling digital rights or adapting the content to specific players (MPEG-21). Standards help to clarify concepts and promote interoperability, but they are currently not appropriate as data models in a multimedia database [1].

We have argued in favor of a multimedia database model structured around a limited number of central concepts that capture the main aspects of the objects to be stored and can easily import descriptions using the current standards. An appropriate model should easily handle large volumes of information, be extensible and integrate flexible search mechanisms.

Relational database management systems provide a natural setup for highly structured data, efficient implementations of the most common data operations and a standard query language, and they respond well to the scalability and search flexibility criteria. However, they lack the extensibility that can be provided by XML, which has become a de facto standard for data representation and interchange.

A multimedia database must account for the storage and retrieval of both the multimedia objects and the associated metadata. An object's metadata includes generic descriptors, like titles and dates, that apply to any object, and others that are specific to certain types of objects (an audio descriptor has no meaning for a textual object, for instance).

The scope of this work is the representation and indexing of metadata in a multimedia database where items can be assembled from diverse sources. Current developments are being tested in a prototype multimedia database (MetaMedia [2]). Its data model has three main underlying principles. The first one is that multimedia objects are usually represented in a hierarchical manner, allowing sets of items to be treated as objects that can have associated descriptions.
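The first principle, together with the distinction between generic and type-specific descriptors made above, can be illustrated with a minimal Python sketch. It is not the MetaMedia schema: the class name MMObject, the descriptor names (tempo, dominant_color, language) and the example values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical descriptor vocabulary, for illustration only.
GENERIC = {"title", "date"}                # apply to any object
TYPE_SPECIFIC = {                          # apply to one media type only
    "audio": {"tempo"},
    "image": {"dominant_color"},
    "text": {"language"},
}


@dataclass
class MMObject:
    """An individual item or a set of items; both can carry descriptions."""
    name: str
    media_type: str                        # "audio", "image", "text" or "collection"
    descriptors: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def describe(self, key, value):
        # Accept generic descriptors plus those defined for this media type.
        allowed = GENERIC | TYPE_SPECIFIC.get(self.media_type, set())
        if key not in allowed:
            raise ValueError(f"{key!r} does not apply to {self.media_type!r} objects")
        self.descriptors[key] = value

    def add(self, child):
        # Attach a sub-collection or an item below this object.
        self.children.append(child)
        return child

    def walk(self, depth=0):
        # Depth-first traversal yielding (depth, object) pairs.
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# A set of items is itself an object and can be described.
archive = MMObject("News archive", "collection")
archive.describe("title", "Evening news")
clip = archive.add(MMObject("clip-01", "audio"))
clip.describe("tempo", 120)
```

In this sketch a collection accepts only generic descriptors, while an audio item additionally accepts the audio-specific one; attaching an image descriptor to the audio item raises an error.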
The second one is that the same kind of descriptors can be used for an individual object and for a set of related objects, allowing descriptors to be inherited from a collection to a sub-collection and down to individual objects. The third one is that the structure of descriptors should be left as open-ended as possible.

The first two principles have been followed in the standards for archival description [3, 4, 5] and prove very useful for the representation of large collections: metadata is frequently available for sets of items rather than individual ones, and inheritance can make it use-