IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 20, NO. 10, OCTOBER 2018 2761
Query Adaptive Multiview Object Instance Search
and Localization Using Sketches
Sreyasee Das Bhattacharjee, Member, IEEE, Junsong Yuan, Senior Member, IEEE, Yicheng Huang,
Jingjing Meng, Member, IEEE, and Lingyu Duan, Member, IEEE
Abstract—Sketch-based object search is a challenging problem
mainly due to three difficulties: 1) how to match the primary sketch
query with the colorful image; 2) how to locate the small object in a
big image that is similar to the sketch query; and 3) given the large
image database, how to ensure an efficient search scheme that is
reasonably scalable. To address the above challenges, we propose
leveraging object proposals for object search and localization.
However, instead of purely relying on sketch features, we propose
fully utilizing the appearance features of object proposals to resolve
the ambiguities between the matching sketch query and object
proposals. Our proposed query adaptive search is formulated as a
subgraph selection problem, which can be solved by the maximum
flow algorithm. By performing query expansion, it can accurately
locate the small target objects in a cluttered background or densely
drawn deformation-intensive cartoon (Manga-like) images. To
improve the computing efficiency of matching proposal candidates,
the proposed Multi View Spatially Constrained Proposal Selection
encodes each identified object proposal in terms of a small local
basis of anchor objects. The results on benchmark datasets validate
the advantages of utilizing both the sketch and appearance features
for sketch-based search, while ensuring sufficient scalability at the
same time.
Index Terms—Sketch-based search, object localization, object
recognition, object retrieval, multi-view proposal selection,
transductive clustering.
Manuscript received October 14, 2017; revised January 8, 2018 and February
11, 2018; accepted February 11, 2018. Date of publication March 9, 2018; date of
current version September 18, 2018. This work was supported in part by the
National Natural Science Foundation of China under Grant 61661146005 and Grant
U1611461, in part by the Key Research and Development Program of Beijing
Municipal Science and Technology Commission (No. D171100003517002),
and in part by the PKU-NTU Joint Research Institute through the Ng Teng
Fong Charitable Foundation, and start-up grants of the University at Buffalo.
The associate editor coordinating the review of this manuscript and approving
it for publication was Dr. Tao Mei. (Corresponding author: Sreyasee Das
Bhattacharjee.)
S. D. Bhattacharjee is with the Department of Computer Science, University
of North Carolina, Charlotte, NC 28223 USA (e-mail: sreya.iitm@gmail.com).
J. Yuan and J. Meng are with the Department of Computer Science &
Engineering, State University of New York at Buffalo, Buffalo, NY 14260 USA
(e-mail: jsyuan@ntu.edu.sg; jingjing.meng@ntu.edu.sg).
Y. Huang and L. Duan are with the School of Electronics Engineering
and Computer Science, Peking University, Beijing 100080, China (e-mail:
anorange0409@gmail.com; lingyu@pku.edu.cn).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TMM.2018.2814338
I. INTRODUCTION
THE task of object instance search is to retrieve and localize
all objects in the database images that are similar to the query. With an
enormous amount of image and visual data available through web-based
resources such as Flickr and Facebook, an effective
search module can support automatic annotation of multimedia
content and aid content-based retrieval. Although many
works efficiently explore image- and object-level
similarities [1], [2] for various application scenarios, a
precise image example that meets the user's specification
may not always be at hand; in such cases, a sketch can serve as an
alternative way to initialize the search. Although a hand-drawn
sketch may not be precise, if drawn with care it can still
provide enough object detail to achieve an effective
instance search [3]–[6]. Despite previous work on sketch-based
image retrieval, an object instance search model still needs to address
three main challenges: (1) Sketches are far from being complete
in terms of the object information that would be critical for a
reliable search performance. For example, if a user is looking
for ‘pyramid’ images, drawing only a ‘triangle’ is not
sufficiently discriminative to identify pyramids uniquely.
On the other hand, a precise image example that meets the user's
specification may not be at hand in every instance. For example, at a
public gathering, when you notice a stranger carrying a fashionable
handbag that you would like to buy, taking a
photograph is not always appropriate. Instead, drawing a sketch
of its shape, or at least of its displayed logo (which may not be
among the brands you know), with your fingers on a smartphone will
probably be easier. Therefore, with the advent of touch-screen
devices, sketch-based query input is indeed a viable and more
effective option for the present generation of users. (2) Accurately
matching and locating the small objects of interest in a big image
with a heavily cluttered background remains challenging. To the
best of our knowledge, this localization problem has not been fully
explored in previous work on sketch-based image retrieval.
(3) These challenges become more critical as the
database grows over time, since the system is expected
to remain scalable enough to maintain its effectiveness.
The proposed graph-based search optimization framework
enables query-adaptive object instance search using sketches.
The first challenge is that the quality of the sketch provided by
an arbitrary user may not always be satisfactory, which
degrades search performance. Therefore, instead of relying purely
on sketch features, we also explore the appearance-based
similarity among database images in a graph regularization
framework to improve the search quality. Object proposals
are used to identify a small number of candidate object regions
irrespective of their sizes, which makes it possible to evaluate
object-level similarities and thereby ensures reliable localization as well.
Fig. 1 describes the entire framework in detail.
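As an illustration only, the idea of refining sketch-based scores with appearance similarity among object proposals can be sketched as a generic score propagation over a similarity graph. This is a minimal sketch of one common form of graph regularization, not the formulation used in this paper; the function name `propagate_scores`, the mixing weight `alpha`, the iteration count, and the toy values are all assumptions made for the example.

```python
from typing import List


def propagate_scores(
    sketch_scores: List[float],
    appearance_sim: List[List[float]],
    alpha: float = 0.5,
    iterations: int = 50,
) -> List[float]:
    """Smooth initial sketch-matching scores over an appearance graph.

    Iterates f <- alpha * S f + (1 - alpha) * y, where y holds the initial
    sketch scores and S is the row-normalized appearance similarity matrix.
    (Hypothetical helper; alpha and iterations are illustrative choices.)
    """
    n = len(sketch_scores)
    # Row-normalize so each proposal averages over its appearance neighbors.
    norm = []
    for row in appearance_sim:
        total = sum(row)
        norm.append([w / total if total > 0 else 0.0 for w in row])

    scores = list(sketch_scores)
    for _ in range(iterations):
        # The comprehension reads the previous scores, so all proposals
        # are updated simultaneously in each iteration.
        scores = [
            alpha * sum(norm[i][j] * scores[j] for j in range(n))
            + (1 - alpha) * sketch_scores[i]
            for i in range(n)
        ]
    return scores


# Toy example (assumed values): proposals 0 and 1 look alike in appearance,
# while proposal 2 has no appearance neighbors.
sketch_scores = [0.9, 0.1, 0.2]
appearance_sim = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
refined = propagate_scores(sketch_scores, appearance_sim)
```

In this toy setting, proposal 1 matches the sketch poorly on its own but resembles the well-matched proposal 0 in appearance, so its refined score rises above that of the isolated proposal 2. This mirrors the motivation stated above: appearance similarity can compensate for an imprecise sketch.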
1520-9210 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.