Application of Relevance Feedback in Content Based Image Retrieval
Using Gaussian Mixture Models
Apostolos Marakakis¹, Nikolaos Galatsanos², Aristidis Likas², and Andreas Stafylopatis¹
¹ School of Electrical and Computer Engineering,
National Technical University of Athens, 15780 Athens, Greece
² Department of Computer Science, University of Ioannina, 45110 Ioannina, Greece
amara@central.ntua.gr, {galatsanos, arly}@cs.uoi.gr, andreas@cs.ntua.gr
Abstract
In this paper a relevance feedback (RF) approach for
content based image retrieval (CBIR) is described and
evaluated. The approach uses Gaussian Mixture (GM)
models of the image features and a query that is updated
in a probabilistic manner. This update reflects the
preferences of the user and is based on the models of both
positive and negative feedback images. Retrieval is based
on a recently proposed distance measure between
probability density functions (pdfs), which can be
computed in closed form for GM models. The proposed
approach takes advantage of the form of this distance
measure and updates it very efficiently based on the
models of the user-specified relevant and irrelevant
images. Comparative experimental results are presented
that demonstrate the merits of the proposed
methodology.
1. Introduction
The goal of content-based image retrieval (CBIR) is
to retrieve relevant images from an image database based
on the similarity of their visual content to one or more
query images. These query images are submitted by the
user as examples of his/her preferences. The CBIR
system then ranks the database images and displays the
retrieved results in order of their similarity
to the query images. Most CBIR systems, e.g. [1]–[8],
model each image using a combination of low-level
features, and then define a distance metric that is used to
quantify the similarity between image models.
Nevertheless, low-level image features cannot always
capture the human perception of image similarity. In other
words, it is difficult to describe the semantic content of
an image using only low-level image features. This is
known in the CBIR community as the semantic gap
problem [11].
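The ranking step described above can be illustrated with a minimal sketch. It assumes that each image is represented by a plain feature vector and that similarity is measured with the Euclidean distance; both are illustrative assumptions, not the GM-based models or the pdf distance used in this paper. The names `rank_database`, `database` and `query` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_database(query, database, distance=euclidean):
    """Return image ids sorted by increasing distance to the query.

    `database` maps an image id to its feature vector (an assumed,
    simplified image model used only for illustration).
    """
    return sorted(database, key=lambda image_id: distance(query, database[image_id]))


# Toy example with made-up three-dimensional feature vectors.
database = {
    "img_a": [0.9, 0.1, 0.0],
    "img_b": [0.2, 0.7, 0.1],
    "img_c": [0.8, 0.2, 0.0],
}
query = [1.0, 0.0, 0.0]
ranking = rank_database(query, database)
print(ranking)
```

Any other distance function with the same two-argument signature could be passed through the `distance` parameter, which is where a measure between image models would be plugged in.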
Relevance feedback (RF) has been proposed as a
methodology to ameliorate this problem, e.g. [1]–[3] and
[6]–[8]. RF attempts to insert the subjective human
perception of image similarity into a CBIR system. Thus,
RF is an interactive process that refines the distance
between the query and the database images by
interacting with the user and taking his/her preferences
into account. To accomplish this, during a round of RF,
users are required to rate the relevance of the retrieved
images according to their preferences. Then, the retrieval
system updates the matching criterion based on the user’s
feedback, e.g. [1]–[3], [6]–[8], [15] and [16].
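As a rough illustration of a single RF round, the sketch below updates a query feature vector from the images the user marked as relevant and irrelevant. It uses the classical Rocchio-style update from text retrieval as a stand-in; it is not the probabilistic GM-based update proposed in this paper. The function names and the weights `alpha`, `beta` and `gamma` are assumptions chosen only for the example.

```python
def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def rocchio_update(query, positives, negatives, alpha=1.0, beta=0.75, gamma=0.25):
    """Move the query toward relevant examples and away from irrelevant ones.

    Rocchio-style stand-in for an RF update (an assumption of this sketch,
    not the method of the paper).
    """
    updated = [alpha * q for q in query]
    if positives:
        pos = mean_vector(positives)
        updated = [u + beta * p for u, p in zip(updated, pos)]
    if negatives:
        neg = mean_vector(negatives)
        updated = [u - gamma * n for u, n in zip(updated, neg)]
    return updated


# One feedback round on made-up two-dimensional features.
new_query = rocchio_update(
    query=[1.0, 0.0],
    positives=[[0.0, 1.0]],   # images rated relevant by the user
    negatives=[[1.0, 0.0]],   # images rated irrelevant by the user
)
print(new_query)
```

In an interactive system the updated query would be used to re-rank the database, and the round would repeat until the user is satisfied.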
Much work on RF has been reported in the literature in
recent years, and the proposed approaches can be
classified into two main categories. The first category
comprises learning-based methods, i.e. methods that
train a classifier (usually an SVM) to distinguish
between the positive and negative
feedback examples, e.g. [6], [26], [28]. The main
drawback of the learning-based approach is that for every
feedback round a new classifier must be trained taking
into account both the previously presented examples and
the new ones presented in the last feedback round.
The second category of RF methods (model-based
methods) includes those that attempt to model the
statistical distribution of feedback examples in the feature
space. These methods can be further divided into two
subcategories.
The first subcategory includes methods that make the
assumption that the feedback examples form one cluster
in feature space. The cornerstone of such methods is
MindReader [1]. Other methods that work under this
assumption are presented in [12], [16], [34], [35]. The
single-cluster assumption made by these methods
is usually very restrictive, even for the set of positive
examples. Moreover, the negative feedback examples
cannot be taken into account because they naturally
spread across different semantic categories, so it cannot
be claimed that they form one cluster. Nevertheless, in
2008 20th IEEE International Conference on Tools with Artificial Intelligence
1082-3409/08 $25.00 © 2008 IEEE
DOI 10.1109/ICTAI.2008.110
141