978-1-4799-7208-1/14/$31.00 ©2014 IEEE

UNSUPERVISED FEATURE APPROACH FOR CONTENT BASED IMAGE RETRIEVAL USING PRINCIPAL COMPONENT ANALYSIS

MUHAMMAD HAMMAD MEMON
School of Computer Science & Engineering, UESTC, Chengdu, 611731, China
muhammadhammadmemon@gmail.com

JIAN-PING LI
School of Computer Science & Engineering, UESTC, Chengdu, 611731, China
jpli2222@uestc.edu.cn

IMRAN MEMON
College of Computer Science, Zhejiang University, Hangzhou, Zhejiang 310027, China
imranmemon52@zju.edu.cn

RIAZ AHMED SHAIKH
School of Computer Science & Engineering, UESTC, Chengdu, 611731, China
riaz.shaikh@salu.edu.pk

ASIF KHAN
School of Computer Science & Engineering, UESTC, Chengdu, 611731, China
asif05amu@gmail.com

SAMUNDRA DEEP
School of Computer Science & Engineering, UESTC, Chengdu, 611731, China
samundradeep@gmail.com

Abstract: In recent years, extremely large collections of images have become available on distributed and heterogeneous platforms across the web. The proliferation of digital cameras and the growth of photo sharing have strained current technology for browsing such collections, but at the same time they have spurred the emergence of new image retrieval techniques based not only on photos' visual information but also on geo-location tags. In current image retrieval systems, the retrieval process is performed using similarity strategies applied to certain features of the image. In this paper, we propose a process for refining image retrieval results by exploiting and fusing two unsupervised techniques, Principal Component Analysis (PCA) and spectral clustering. PCA is first used to remove outliers from the initially retrieved image set and then to extract the principal components of the feature values. The feature values of each image are then represented as a linear combination of these principal components.
Spectral clustering supports the retrieval process by clustering visually similar images together. PCA and spectral clustering require manual tuning of their parameters, which usually requires a priori knowledge of the dataset. To overcome this problem, we developed a tuning mechanism that automatically tunes the parameters of both algorithms. To evaluate the proposed approach, we used thousands of images from Flickr, downloaded using text queries for well-known cultural heritage monuments. The proposed method was applied to and tested on a set of images of varied scenes. Experimental results show the superior performance of this approach.

Keywords: Image retrieval; Image clustering; Principal Component Analysis; Spectral clustering.

1. Introduction

In recent years, extremely large collections of images and videos have become available on distributed and heterogeneous platforms across the web. More than 950 million new images are created on the Internet each year, covering not only contemporary events but also historic incidents and cultural heritage artifacts. Image retrieval approaches based on keywords and textual metadata face serious challenges. Principal Component Analysis (PCA) is the predominant linear dimensionality reduction technique, and it has been widely applied to datasets in all scientific domains, including computer vision and graphics [1]. The advent of the digital camera, along with new technologies, enhanced the description of digital photos and spurred the emergence of new image retrieval techniques. Besides visual information, digital photos are characterized by automatically generated geo-location tags, including longitude and latitude, and camera EXIF data, as well as manually defined user (photographer or community) tags.
However, manual image tagging by users is inconsistent, and geo-location tags and camera EXIF data can, in many cases, mislead the image retrieval process [2]. Consider, for example, a query containing the words "Acropolis Parthenon" along with the longitude and latitude of this monument. A large subset of the retrieved images will not depict the monument; instead, they will depict the view of Athens from the Parthenon or exhibits of the Acropolis Museum. Although additional image information may prove very useful for preliminary image retrieval, the final retrieved result must be refined by exploiting visual information. The feature vectors encode visual features