828 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 21, NO. 2, FEBRUARY 2012
Object Segmentation of Database Images by Dual
Multiscale Morphological Reconstructions
and Retrieval Applications
Jiann-Jone Chen, Member, IEEE, Chun-Rong Su, W. Eric L. Grimson, Fellow, IEEE,
Jun-Lin Liu, and De-Hui Shiue
Abstract—Processing images for specific targets on a large
scale has to handle various kinds of contents with regular pro-
cessing steps. To segment objects in one image, we utilized dual
multiScalE Graylevel mOrphological open and close recoNstruc-
tions (SEGON) to build a background (BG) gray-level variation
mesh, which can help to identify BG and object regions. It was
developed from a macroscopic perspective on image BG gray
levels and implemented using standard procedures, thus robustly
dealing with large-scale database images. The image segmenta-
tion capability of existing methods can be exploited by the BG
mesh to improve object segmentation accuracy. To evaluate the
segmentation accuracy, the probability of coherent segmentation
labeling, i.e., the normalized probabilistic Rand index (PRI),
between a computer-segmented image and its hand-labeled counterpart
was computed for comparison. Content-based image retrieval (CBIR)
was carried out to evaluate the object segmentation capability in
dealing with large-scale database images. Retrieval precision–re-
call (PR) and rank performances, with and without SEGON,
were compared. For multi-instance retrieval with shape features,
AdaBoost was used to select salient common feature elements. For
color features, the histogram intersection between two scalable
HSV descriptors was calculated, and the mean feature vector was
used for multi-instance retrieval. The distance measure for the color
feature can be adapted when both positive and negative queries
are provided. The normalized correlation coefficient of features
among query samples was computed to integrate the similarity
ranks of different features in order to perform multi-instance,
multifeature queries. Experiments showed that the proposed
object segmentation method outperforms others by 21% in the
PRI. Performing SEGON-enabled CBIR on large-scale databases
also improves on the PR performance reported elsewhere by up
to 42% at a recall rate of 0.5. The proposed object segmentation
method can be extended to extract other image features, and new
feature types can be incorporated into the algorithm to further
improve the image retrieval performance.
Index Terms—Content-based image retrieval (CBIR), dual
multiscale gray-level morphological reconstructions, image back-
ground (BG) gray-level variation mesh, object segmentation.
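As a rough illustration of the evaluation measure named in the abstract, the following Python sketch computes the plain Rand index between two labelings and averages it over several hand-labeled references, which gives the unnormalized probabilistic Rand index. The function names and the flat per-pixel label lists are assumptions made for illustration. The normalization step is not shown, and this is not the authors' implementation.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of pixel pairs on which two labelings agree.

    A pair agrees when both labelings put the two pixels in the same
    segment, or both put them in different segments. Label values
    themselves do not need to match between the two labelings.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same pixels")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two pixels")
    agree = 0
    for i, j in combinations(range(n), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            agree += 1
    return agree / (n * (n - 1) // 2)


def probabilistic_rand_index(test_labels, ground_truths):
    """Mean Rand index of one test labeling against several hand labelings."""
    if not ground_truths:
        raise ValueError("need at least one ground-truth labeling")
    scores = [rand_index(test_labels, truth) for truth in ground_truths]
    return sum(scores) / len(scores)
```

The pairwise loop is quadratic in the number of pixels, so this form is only practical for small toy inputs; a contingency-table formulation would be the usual choice for full images.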
Manuscript received September 28, 2009; revised August 21, 2010 and June
09, 2011; accepted August 09, 2011. Date of publication August 30, 2011;
date of current version January 18, 2012. This work was supported in part by
the National Science Council under Grant NSC100-2221-E-011-156 and Grant
NSC99-2218-E-011-002 and in part by the Information and Communications
Research Laboratories, Industrial Technology Research Institute, under Grant
A352BR2100. The associate editor coordinating the review of this manuscript
and approving it for publication was Prof. Ying Wu.
J.-J. Chen and C.-R. Su are with the Department of Electrical Engineering,
National Taiwan University of Science and Technology, Taipei 10673, Taiwan
(e-mail: jjchen@mail.ntust.edu.tw; d9607304@mail.ntust.edu.tw).
W. E. L. Grimson is with the Department of Electrical Engineering and Com-
puter Science, Massachusetts Institute of Technology, Cambridge, MA 02139
USA (e-mail: welg@csail.mit.edu).
J.-L. Liu and D.-H. Shiue are with the Information and Communication Re-
search Laboratories, Industrial Technology Research Institute, Hsinchu 10673,
Taiwan (e-mail: JUNLIN@itri.org.tw; ryan64@itri.org.tw).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIP.2011.2166558
I. INTRODUCTION
CONTENT-BASED similarity retrieval for multimedia
data has become important since international coding standards,
such as those of the Joint Photographic Experts Group (JPEG) and
the Moving Picture Experts Group (MPEG-1, MPEG-2, and MPEG-4),
started to be widely used and distributed over
the Internet. When one considers the length and the detail of
hand-coded similarity definitions, one could justly claim that
“one image is worth more than a thousand words.” The multimedia
content description standard, i.e., MPEG-7, provides formal de-
scriptors for different applications, such as archival, browsing,
retrieval, etc. Similarity between two media objects can be
evaluated by computing the distance between their numerical
feature descriptors. The distance measure is performed by
reorganizing the descriptor space such that objects more similar
to the query object would yield smaller distances [1]. These
visual descriptors provide accurate similarity measurement by
feature type, such as color, shape, and texture. However, the
capability of MPEG-7 descriptors in measuring the similarity
is limited to the description space. If the descriptors are not
applied to the right feature content in images, improving the
retrieval method alone will not yield accurate retrieved results.
In other words, it is necessary to perform preprocessing on all
database images before applying the descriptors. The purpose
of our research was to develop a robust image object segmen-
tation algorithm with regular processing steps to deal with
large-scale database images.
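The distance-based ranking described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: descriptors are plain normalized color histograms rather than MPEG-7 descriptors, similarity is the histogram intersection mentioned in the abstract, distance is taken as one minus that similarity, and a multi-instance query is formed as the mean of the query descriptors. All function names and the toy data are hypothetical.

```python
def normalize(hist):
    """Scale a histogram of non-negative counts so that it sums to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive mass")
    return [value / total for value in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalized histograms."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


def mean_descriptor(descriptors):
    """Bin-wise mean of several query descriptors (multi-instance query)."""
    if not descriptors:
        raise ValueError("need at least one descriptor")
    count = len(descriptors)
    return [sum(column) / count for column in zip(*descriptors)]


def rank_database(query, database):
    """Return (name, distance) pairs, most similar (smallest distance) first."""
    scored = [
        (name, 1.0 - histogram_intersection(query, descriptor))
        for name, descriptor in database.items()
    ]
    return sorted(scored, key=lambda item: item[1])


# Toy 4-bin histograms standing in for real color descriptors.
database = {
    "a": normalize([2, 2, 0, 0]),
    "b": normalize([0, 0, 2, 2]),
    "c": normalize([1, 1, 1, 1]),
}
query = mean_descriptor([normalize([2, 2, 0, 0]), normalize([3, 1, 0, 0])])
ranking = rank_database(query, database)
```

In this toy case the image whose histogram overlaps the query most is ranked first. If the histograms were computed over whole images rather than segmented objects, background colors would enter the comparison, which is the motivation for preprocessing given in the text.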
Concerning visual signal processing, image segmentation
is essential for various applications. It describes the process
whereby each pixel in an image is labeled, such that pixels
with the same label present coherent visual characteristics. This
allows for a semantic approach to image analysis. One way to
perform image segmentation is simply to apply a clustering
algorithm in a color space [2], e.g., HSV or RGB;
segmentation can also be based on the statistics of the color
space description of the image, e.g., the color histogram. These
methods are carried out in the color space domain instead of the
image pixel domain, and their results depend on the initial cluster
setting. Edge-based segmentation is simple, but it requires a
further linking procedure to segment an image [3]. Among
color region-based approaches, the region-growing approach
[4] starts from an initial set of seeds; regions are then grown
by comparing neighboring pixels, which are merged with the
region with the closest mean color. JSEG [5], [6] seeks to divide