978-1-4799-7208-1/14/$31.00 ©2014 IEEE 271
UNSUPERVISED FEATURE APPROACH FOR CONTENT BASED IMAGE
RETRIEVAL USING PRINCIPAL COMPONENT ANALYSIS
MUHAMMAD HAMMAD MEMON
School of Computer Science & Engineering,
UESTC, Chengdu, 611731, China
muhammadhammadmemon@gmail.com
JIAN-PING LI
School of Computer Science & Engineering,
UESTC, Chengdu, 611731, China
jpli2222@uestc.edu.cn
IMRAN MEMON
College of Computer Science, Zhejiang
University, Hangzhou, Zhejiang
310027, China
imranmemon52@zju.edu.cn
RIAZ AHMED SHAIKH
School of Computer Science & Engineering,
UESTC, Chengdu, 611731, China
riaz.shaikh@salu.edu.pk
ASIF KHAN
School of Computer Science & Engineering,
UESTC, Chengdu, 611731, China
asif05amu@gmail.com
SAMUNDRA DEEP
School of Computer Science & Engineering,
UESTC, Chengdu, 611731, China
samundradeep@gmail.com
Abstract:
In recent years, extremely large collections of images have
become available on distributed and heterogeneous platforms
across the web. The proliferation of digital cameras and the
growth of photo sharing challenge current technology for
browsing such collections, but at the same time they have
spurred the emergence of new image retrieval techniques
based not only on photos' visual information but also on
geo-location tags. In current image retrieval systems, the
retrieval process is performed using similarity strategies
applied to certain features of the image. In this paper, we
propose a process for refining image retrieval results by
exploiting and fusing two unsupervised techniques, Principal
Component Analysis (PCA) and spectral clustering. PCA is
first used to remove outliers from the initially retrieved
image set and then to extract the principal components of
the feature values. The feature values of each image are
then represented as a linear combination of these principal
components. Spectral clustering analyzes the retrieved set
by clustering together visually similar images. PCA and
spectral clustering require manual tuning of their
parameters, which usually requires a priori knowledge of the
dataset. To overcome this problem, we developed a mechanism
that automatically tunes the parameters of both algorithms.
To evaluate the proposed approach, we used thousands of
images from Flickr, downloaded using text queries for
well-known cultural heritage monuments. The proposed method
was tested on a set of images from varied scenes.
Experimental results show the superior performance of this
approach.
Keywords:
Image retrieval; Image clustering; Principal
Component Analysis; Spectral clustering.
1. Introduction
In recent years, extremely large collections of images and
videos have become available on distributed and
heterogeneous platforms across the web. More than 950
million new images are created on the Internet annually,
covering not only contemporary events but also historic
incidents and cultural heritage artifacts. Image retrieval
approaches based on keywords and textual metadata face
serious challenges. Principal Component Analysis (PCA) is
the predominant linear dimensionality reduction technique
and has been widely applied to datasets in all scientific
domains, including computer vision and graphics [1].
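As an illustration of PCA used for dimensionality reduction, the following minimal sketch projects a set of image feature vectors onto their leading principal components, so that each vector is approximated by a linear combination of those components. It is not the implementation used in this paper: the function name pca_project, the random 8-dimensional descriptors, and the choice of three components are assumptions made only for the example.

```python
import numpy as np

def pca_project(features, n_components):
    """Project row-wise feature vectors onto the top principal components.

    Hypothetical helper for illustration; not taken from the paper.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    # SVD of the centered data: the rows of vt are the principal directions,
    # ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Coefficients of each feature vector in the principal-component basis.
    coeffs = centered @ components.T
    # Fraction of the total variance captured by each kept component.
    explained = singular_values[:n_components] ** 2 / np.sum(singular_values ** 2)
    return coeffs, components, mean, explained

# Assumed toy data: 50 images, each described by an 8-dimensional feature vector.
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 8))

coeffs, components, mean, explained = pca_project(features, n_components=3)

# Each image is approximated as the mean plus a linear combination
# of the three retained principal components.
reconstructed = mean + coeffs @ components
```

Keeping fewer components gives a more compact description at the cost of a larger reconstruction error. How many components to keep is a tuning decision, and the sketch fixes it by hand.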
The advent of the digital camera, along with new
technologies, has enriched the description of digital photos
and spurred the emergence of new image retrieval
techniques. Besides visual information, digital photos are
characterized by automatically generated geo-location tags,
including longitude and latitude, and camera EXIF data, as
well as tags defined manually by human users (the
photographer or the community). However, manual image
tagging is inconsistent, and geo-location tags together
with camera EXIF data can in many cases mislead the image
retrieval process [2]. Consider, for example, a query
containing the words "Acropolis Parthenon" along with the
longitude and latitude of this monument. A large subset
of the retrieved images will not depict the monument.
Instead, they will depict the view of Athens from the
Parthenon or Acropolis museum exhibits. Although
additional image information may prove very useful
for preliminary image retrieval, the final retrieved result
must be refined by exploiting visual
information. The feature vectors encode visual features