SUPPORT VECTOR MACHINES FOR REGION-BASED IMAGE RETRIEVAL
This work was performed at Microsoft Research Asia.
The authors are supported by the National Natural Science Foundation of China (No. 60135010).
Feng Jing
State Key Lab of Intelligent
Technology and Systems
Beijing 100084, China
Scenery_JF@hotmail.com
Mingjing Li, Hong-Jiang Zhang
Microsoft Research Asia
49 Zhichun Road
Beijing 100080, China
{mjli, hjzhang}@microsoft.com
Bo Zhang
State Key Lab of Intelligent
Technology and Systems
Beijing 100084, China
dcszb@mail.tsinghua.edu.cn
ABSTRACT
In this paper, the application of support vector machines
(SVM) in relevance feedback for region-based image
retrieval is investigated. Both the one-class SVM as a
class distribution estimator and the two-class SVM as a
classifier are taken into account. For the latter, two
representative display strategies are studied. Since the
common kernels often rely on the inner product or the L_p norm in
the input space, they are ill-suited to region-based
image retrieval systems that use variable-length
representations. To resolve the issue, a new kind of kernel
that generalizes the Gaussian kernel is proposed.
Experimental results on a database of 10,000 general-
purpose images demonstrate the effectiveness and
robustness of the proposed approach.
1. INTRODUCTION
Many early content-based image retrieval (CBIR) systems
perform retrieval based primarily on global features [5][9].
It is not unusual that users accessing a CBIR system look
for objects, but the aforementioned systems are likely to
fail, since a single signature computed for the entire image
cannot sufficiently capture the important properties of
individual objects. Region-based image retrieval (RBIR)
methods [1][6][11] attempt to overcome the drawback of
global features by representing images at object-level,
which is intended to be close to the perception of human
visual system.
To narrow down the gap between low-level features
and high-level concepts, relevance feedback (RF), initially
developed in text retrieval [8], was introduced into CBIR
during the mid-1990s and has been shown to provide a
dramatic performance boost in retrieval systems [6][10].
The main idea of RF is to let users guide the system.
During the retrieval process, the user interacts with the system
and rates the relevance of the retrieved images according
to his/her subjective judgment. With this additional
information, the system dynamically learns the user’s
intention, and gradually presents better results.
As a core machine learning technology, SVM has not
only strong theoretical foundations but also excellent
empirical successes [3]. SVM has also been introduced
into CBIR as a powerful RF tool, and performs fairly well
in the systems that use global representations [2][10][12].
Since common kernels of SVM usually rely on the inner
product or the L_p norm in the input space, they are
inappropriate for RBIR systems [6][11] that use
variable-length representations. To address the issue, a
new kernel is proposed. It is a generalization of the Gaussian
kernel with the Euclidean distance replaced by the Earth
Mover's Distance (EMD) [7].
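As an illustration only, the idea of such a kernel can be sketched in Python. The sketch assumes that each image is a variable-length set of region feature vectors with non-negative weights, that the ground distance between regions is Euclidean, and that the kernel takes the literal form exp(-EMD^2 / (2 sigma^2)). These are assumptions made for the example, not the exact definition used in this paper. The names emd and emd_gaussian_kernel are hypothetical, and the EMD is solved as a small linear program with scipy.optimize.linprog.

```python
import numpy as np
from scipy.optimize import linprog


def emd(feats_a, weights_a, feats_b, weights_b):
    """Earth Mover's Distance between two weighted sets of region features."""
    feats_a = np.asarray(feats_a, dtype=float)
    feats_b = np.asarray(feats_b, dtype=float)
    w_a = np.asarray(weights_a, dtype=float)
    w_b = np.asarray(weights_b, dtype=float)
    m, n = len(w_a), len(w_b)

    # Assumed ground distance: Euclidean distance between region features.
    ground = np.linalg.norm(feats_a[:, None, :] - feats_b[None, :, :], axis=2)

    # Flow variables f[i, j], flattened row-major into a vector of length m*n.
    a_ub = np.zeros((m + n, m * n))
    for i in range(m):
        a_ub[i, i * n:(i + 1) * n] = 1.0   # sum_j f[i, j] <= w_a[i]
    for j in range(n):
        a_ub[m + j, j::n] = 1.0            # sum_i f[i, j] <= w_b[j]
    b_ub = np.concatenate([w_a, w_b])

    # The total flow must equal the smaller of the two total weights.
    total_flow = min(w_a.sum(), w_b.sum())
    a_eq = np.ones((1, m * n))
    b_eq = [total_flow]

    res = linprog(ground.ravel(), A_ub=a_ub, b_ub=b_ub,
                  A_eq=a_eq, b_eq=b_eq, bounds=(0, None))
    if not res.success:
        raise RuntimeError(res.message)
    # Normalize the minimal transport cost by the amount of flow moved.
    return res.fun / total_flow


def emd_gaussian_kernel(image_a, image_b, sigma=1.0):
    """Gaussian-style kernel with the Euclidean distance replaced by EMD.

    Each image is a (features, weights) pair; the number of regions may differ.
    The squared-distance form is an assumption of this sketch.
    """
    d = emd(*image_a, *image_b)
    return float(np.exp(-d ** 2 / (2.0 * sigma ** 2)))
```

For instance, an image with a single region at the origin and an image with half of its weight at the origin and half at (3, 4) are at EMD 2.5 under these assumptions, even though the two images contain different numbers of regions. A matrix of such kernel values could then be supplied to any SVM implementation that accepts precomputed kernels.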
The remainder of the paper is organized as follows.
In Section 2, previous work related to SVM in CBIR
is summarized. The basic elements of a region-based
image retrieval system are explained in Section 3. In
Section 4, a new kernel is introduced. Preliminary
experimental results are given in Section 5. Finally, we
conclude in Section 6.
2. SVM IN CBIR
There are at least two criteria to characterize the
application of SVM in CBIR.
2.1. One Class vs. Two Classes
Given the relevance feedback information, two kinds of
learning can generally be performed to boost retrieval
performance. One is to estimate the distribution of the
target images, while the other is to learn a boundary that
separates the target images from the rest. For the former,
the so-called one-class SVM was adopted [2]. A kernel
II - 21 0-7803-7965-9/03/$17.00 ©2003 IEEE ICME 2003