Visual Information Retrieval Using Synthesized Imagery

Bart Thomee, Mark J. Huiskes, Erwin Bakker, Michael S. Lew
LIACS Media Lab, Leiden University
bthomee@liacs.nl, markh@liacs.nl, erwin@liacs.nl, mlew@liacs.nl

ABSTRACT
In this project (VIRSI) we investigate the promising content-based retrieval paradigm known as interactive search or relevance feedback, and aim to extend it through the use of synthetic imagery. In relevance feedback methods, the user himself is a key factor in the search process as he provides positive and negative feedback on the results, which the system uses to iteratively improve the set of candidate results. In our approach we closely integrate the generation of synthetic imagery in the relevance feedback process through a new fundamental paradigm: Artificial Imagination (AIm).

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval – Query formulation, Relevance feedback, Search process.

General Terms
Algorithms, Human Factors.

Keywords
Content-based retrieval, principal component analysis, artificial imagination

1. INTRODUCTION
Beyond the borders of science, culture, and art, the issue of finding multimedia information has become one of the grand challenges of our time. There has been significant success in searching through text databases or multimedia using text annotations, but in many situations text annotation is not available for multimedia libraries (e.g. [14]). In such cases it is necessary to use content-based retrieval methods which directly analyze the pictorial content of the media. One of the main areas explored in this project is the way in which artificial imagination can be used to aid a computer algorithm in learning new visual concepts, in particular to obtain meaningful solutions to user queries on multimedia databases.
Our own visual imagination allows us to create new and useful examples based on our memories and experiences. In artificial imagination we intend to harness similar strategies to intelligently synthesize informative examples. In particular, we intend to endow the computer with the ability to ask whether a particular synthesized example (which is not yet in the database) is relevant. Such an example would, for instance, target a particular feature that could be important to the query. A real-life analogue is a police officer who creates a sketch of an unknown person to help identify a criminal.

2. BACKGROUND
The earliest years of Visual Information Retrieval (VIR) were characterized by "one-shot" methods based on the assumption that a single query would suffice to obtain useful results. Influential and popular examples are the similarity-based search systems QBIC ([6]) and Virage ([5]) from the mid-1990s.

Beyond the one-shot queries of the early similarity-based search systems, the next generation of systems attempted to integrate continuous feedback from the user in order to gain improved insight into the user query. The resulting relevance feedback methods (e.g. [3, 4, 8, 9, 10, 11, 15, 21]) usually share the following process: the system shows the user a set of candidate results; the user labels a subset of the candidates as positive or negative examples; based on these examples the system reformulates the representation of the user query, and subsequently presents the user with the next set of candidate results. This process is repeated until, hopefully, the user is satisfied. Relevance feedback can be considered a special case of emergent semantics. Other names used in the computer vision literature include query refinement, interactive search, and active learning.
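The feedback process described above can be illustrated with a minimal Python sketch. It is not the implementation used in this project: the function names (`top_k`, `reformulate`, `feedback_loop`), the Euclidean distance, the step sizes `beta` and `gamma`, and the simple query-point movement rule used for the reformulation step are all assumptions made for illustration. The user is simulated by a hypothetical `label_fn` callback that labels every candidate, whereas a real user would typically label only a subset.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def top_k(query, database, k):
    """Indices of the k database vectors closest to the query (Euclidean)."""
    order = sorted(range(len(database)),
                   key=lambda i: math.dist(query, database[i]))
    return order[:k]


def reformulate(query, positives, negatives, beta=0.75, gamma=0.25):
    """Assumed update rule: step toward the centroid of the positive
    examples and away from the centroid of the negative examples."""
    new_query = list(query)
    if positives:
        pos = centroid(positives)
        new_query = [n + beta * (p - q)
                     for n, p, q in zip(new_query, pos, query)]
    if negatives:
        neg = centroid(negatives)
        new_query = [n - gamma * (c - q)
                     for n, c, q in zip(new_query, neg, query)]
    return new_query


def feedback_loop(query, database, label_fn, k=3, max_rounds=5):
    """Show candidates, collect labels, reformulate, and repeat.

    label_fn(vector) stands in for the user: True marks a positive
    example, False a negative one. The loop stops when no candidate is
    labeled negative (the simulated user is satisfied) or after
    max_rounds reformulations.
    """
    candidates = top_k(query, database, k)
    for _ in range(max_rounds):
        positives = [database[i] for i in candidates if label_fn(database[i])]
        negatives = [database[i] for i in candidates
                     if not label_fn(database[i])]
        if not negatives:
            break
        query = reformulate(query, positives, negatives)
        candidates = top_k(query, database, k)
    return query, candidates
```

Calling `feedback_loop(initial_query, feature_vectors, label_fn)` returns the final query point together with the indices of the last candidate set. In practice the reformulation step could be replaced by any learning algorithm that fits the labeled examples.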
A general approach is to view relevance feedback as a type of pattern classification in which a relevant class is learned from a set of training examples with relevant and irrelevant labels. In principle, it is therefore possible to apply any type of learning algorithm in the relevance feedback loop. One of the major problems in relevance feedback is how to address the typically very small training sets (e.g. [21]); in [20] it was found that combining multiple relevance feedback strategies gives superior results compared with any single strategy. In [17], Tieu and Viola proposed a method for applying the AdaBoost learning algorithm and noted that it is quite suitable for relevance feedback because AdaBoost works well with small training sets. A comparison between AdaBoost and SVM in [7] found that SVM gives superior retrieval results. Good overviews can also be found in [12] and [21].

Copyright is held by the author/owner(s). CIVR’07, July 9-11, 2007, Amsterdam, The Netherlands. Copyright 2007 ACM 978-1-59593-733-9/07/0007.

As our system mainly focuses on the feedback-based generation of examples, we use the classic and well-known relevance feedback method proposed by Rocchio ([13]), where the simple idea is to move a query point toward the relevant examples and away from the irrelevant examples. The Rocchio