Apostol (Paul) Natsev^a, John R. Smith^b, Yuan-Chi Chang^b, Chung-Sheng Li^b, Jeffrey S. Vitter^a

^a Department of Computer Science, Duke University, P.O. Box 90129, Durham, NC 27708. Email: {natsev, jsv}@cs.duke.edu.
^b IBM T. J. Watson Research Center, 30 Saw Mill River Road, Hawthorne, NY 10532. Email: {jsmith, yuanchi, csli}@us.ibm.com.

ABSTRACT

This paper investigates the problem of high-level querying of multimedia data by imposing arbitrary domain-specific constraints among multimedia objects. We argue that the current structured query model and the query-by-content model are insufficient for many important applications, and we propose an alternative query framework that unifies and extends the two previous models. The proposed framework is based on the query-by-concept paradigm, where the query is expressed simply in terms of concepts, regardless of the complexity of the underlying multimedia search engines. The query-by-concept paradigm was previously illustrated by the CAMEL system. The present paper builds upon and extends that work by adding arbitrary constraints and multiple levels of hierarchy to the concept representation model. We treat queries simply as descriptions of virtual data sets, which allows us to use the same unifying concept representation for query specification as well as for data annotation. We also identify some key issues and challenges presented by the new framework, and we outline possible approaches for overcoming them. In particular, we study the problems of concept representation, extraction, refinement, storage, and matching.

Keywords: Constraints, concepts, CAMEL, content-based query, images, MPEG-7, multimedia

1. INTRODUCTION

The advances in computing power over the last few years have made a considerable amount of multimedia data available on the web and in domain-specific applications.
The result has been an increasing emphasis on the requirements for searching and describing such large repositories of multimedia data. One of the first realizations about this new field was that the traditional database model for querying structured data does not generalize well to multimedia domains of unstructured data. One reason is that in databases the results are always well-defined, unordered sets (i.e., each item is either in or out of the result set), whereas in multimedia queries the results are fuzzy and typically ordered by their degree of similarity to the query. Another difference lies in the way the data is queried. The relational database model treats all items as attribute sets, so all queries are posed over those attributes; it is not apparent, however, how to convert a sample image query into an attribute query in a meaningful and simple fashion. This observation led to the widespread adoption of the query-by-example (or query-by-content) model for image search, where the user specifies a sample image and the system returns images that are similar in color, shape, texture, and other features.

The query-by-example model has some advantages: images are compared and ranked on objective visual features rather than on subjective image annotations, and the image data can be indexed automatically, as opposed to labor-intensive manual annotation. However, this model has its drawbacks as well. For one, in some applications it is difficult to find an appropriate query sample from the same domain, or it may be difficult to provide it to the query engine. For example, if the user is looking for images on the Internet that are similar to a specific image, there may be no way to upload the query image to the search engine.
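The contrast drawn above, between the crisp in/out semantics of a relational result set and the fuzzy, similarity-ranked results of a content-based query, can be sketched as follows. This is an illustrative toy example, not part of the CAMEL system or any engine described in this paper: the image names are hypothetical, and the features are simplistic 4-bin color histograms standing in for the richer color, shape, and texture descriptors a real engine would extract.

```python
# Sketch (assumed, not from the paper): query-by-example as similarity
# ranking. Every database item receives a fuzzy score, and the ordering
# itself carries the answer -- unlike a relational query's crisp set.
import math

def cosine_similarity(a, b):
    """Degree of similarity in [0, 1] for non-negative feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def query_by_example(query_features, database):
    """Return ALL items, ordered by decreasing similarity to the example."""
    scored = [(name, cosine_similarity(query_features, feats))
              for name, feats in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy database of 4-bin color histograms (hypothetical image names).
images = {
    "sunset.jpg": [0.7, 0.2, 0.05, 0.05],
    "forest.jpg": [0.1, 0.6, 0.25, 0.05],
    "beach.jpg":  [0.5, 0.3, 0.1, 0.1],
}

for name, score in query_by_example([0.6, 0.25, 0.1, 0.05], images):
    print(f"{name}: {score:.3f}")
```

Note that no item is excluded: a reddish query example ranks the reddish histograms first, but every image appears somewhere in the ordering with a graded score, which is exactly the behavior a well-defined, unordered relational result set cannot express.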
In most cases, the user would have to perform an extra text-search step over the image annotations (file name, URL, surrounding text context, etc.) in order to find suitable candidate images with which to start the content-based search. Even then, a suitable query image may not be found, and this reduction of the image search problem to a text search problem is only a

* Contact author. This work was done while the author was visiting the IBM T. J. Watson Research Center.