Application of Relevance Feedback in Content Based Image Retrieval Using Gaussian Mixture Models

Apostolos Marakakis¹, Nikolaos Galatsanos², Aristidis Likas², and Andreas Stafylopatis¹

¹ School of Electrical and Computer Engineering, National Technical University of Athens, 15780 Athens, Greece
² Department of Computer Science, University of Ioannina, 45110 Ioannina, Greece
amara@central.ntua.gr, {galatsanos, arly}@cs.uoi.gr, andreas@cs.ntua.gr

Abstract

In this paper a relevance feedback (RF) approach for content based image retrieval (CBIR) is described and evaluated. The approach uses Gaussian Mixture (GM) models of the image features and a query that is updated in a probabilistic manner. This update reflects the preferences of the user and is based on the models of both positive and negative feedback images. Retrieval is based on a recently proposed distance measure between probability density functions (pdfs), which can be computed in closed form for GM models. The proposed approach takes advantage of the form of this distance measure and updates it very efficiently based on the models of the user specified relevant and irrelevant images. For evaluation purposes, comparative experimental results are presented that demonstrate the merits of the proposed methodology.

1. Introduction

The target of content-based image retrieval (CBIR) is to retrieve relevant images from an image database based on the similarity of their visual content with one or more query images. These query images are submitted by the user as examples of his/her preferences. Then, the CBIR system ranks the database images and displays the retrieved results ordered with respect to their similarity with the query images. Most CBIR systems, e.g. [1]–[8], model each image using a combination of low-level features, and then define a distance metric that is used to quantify the similarity between image models.
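The ranking step described above can be illustrated with a minimal sketch. It assumes that each image is represented by a plain feature vector and that similarity is measured with the Euclidean distance; both are stand-ins chosen for brevity, not the GM models or the pdf distance measure used in this paper. The names `rank_database`, `euclidean`, and the toy feature values are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (stand-in metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_database(query, database, distance=euclidean):
    """Return image ids ordered from most to least similar to the query."""
    return sorted(database, key=lambda image_id: distance(query, database[image_id]))


# Hypothetical low-level feature vectors for three database images.
database = {
    "img_a": [0.9, 0.1, 0.0],
    "img_b": [0.2, 0.7, 0.1],
    "img_c": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]

# Images closest to the query come first in the ranking.
ranking = rank_database(query, database)
```

Passing the metric as the `distance` argument keeps the ranking logic independent of the image model, so a different distance between image models could be substituted without changing `rank_database`.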
Nevertheless, low-level image features cannot always capture the human perception of image similarity. In other words, it is difficult to describe the semantic content of an image using only low-level features. This is known in the CBIR community as the semantic gap problem [11]. Relevance feedback (RF) has been proposed as a methodology to ameliorate this problem, e.g. [1]–[3] and [6]–[8]. RF attempts to insert the subjective human perception of image similarity into a CBIR system. It is an interactive process that refines the distance between the query and the database images by taking the user's preferences into account. During a round of RF, the user rates the relevance of the retrieved images. The retrieval system then updates the matching criterion based on this feedback, e.g. [1]–[3], [6]–[8], [15] and [16].

The RF approaches proposed in the literature in recent years can be classified into two main categories. The first category comprises learning-based methods, which train a classifier (usually an SVM) to distinguish between the positive and negative feedback examples, e.g. [28], [6], [26]. The main drawback of the learning-based approach is that a new classifier must be trained in every feedback round, taking into account both the previously presented examples and those presented in the latest round. The second category (model-based methods) includes methods that attempt to model the statistical distribution of the feedback examples in the feature space. These methods can be further divided into two subcategories. The first subcategory includes methods that assume that the feedback examples form one cluster in feature space. The cornerstone of such methods is MindReader [1].
Other methods that work under this assumption are presented in [12], [16], [34], [35]. The single-cluster assumption made by these methods is usually very restrictive, even for the set of positive examples. Moreover, the negative feedback examples cannot be taken into account, because they naturally spread across different semantic categories and therefore cannot be claimed to form one cluster. Nevertheless, in
2008 20th IEEE International Conference on Tools with Artificial Intelligence 1082-3409/08 $25.00 © 2008 IEEE DOI 10.1109/ICTAI.2008.110 141