Research Journal of Applied Sciences, Engineering and Technology 7(14): 2806-2812, 2014
ISSN: 2040-7459; e-ISSN: 2040-7467
© Maxwell Scientific Organization, 2014
Submitted: May 01, 2012 Accepted: May 26, 2012 Published: April 12, 2014

A Spatial Visual Words of Discrete Image Scene for Indoor Localization

Abbas M. Ali, Md Jan Nordin and Azizi Abdullah
Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia

Corresponding Author: Abbas M. Ali, Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia

Abstract: One of the fundamental problems in accurate indoor place recognition is the presence of similar scene images in different places within the mobile robot's environment, such as a computer or an office table that appears in many rooms. This similarity causes different places to be confused with one another. To overcome this, the local features of these scene images should be represented in a more discriminative and more robust way, which requires taking the spatial relations of the local features into account. This study introduces a novel approach for place recognition based on the degree of correlation between the entropy of covariance feature vectors. These feature vectors are extracted from the minimum distance between the SIFT grid features of the scene image and the optimized K entries from the codebook, which is constructed by K-means. The Entropy of Covariance features (ECV) is used to represent the scene image in order to remove the confusion between similar images that belong to different places. The acquired results show that this approach behaves stably and reliably in place recognition for robot localization and outperforms the other approaches.
Finally, the proposed ECV approach provides an intelligent way of localizing the robot through the correlation of the entropy covariance feature vectors of the scene images.

Keywords: Entropy covariance feature vectors, grid, place recognition, SIFT, K-means

INTRODUCTION

Place recognition is one of the basic issues in localization for mobile robots navigating an environment. One of the fundamental problems in visual place recognition is confusion when matching a visual scene image against the images stored in the database. This problem is caused by the instability of the local feature representation. Machine learning is used to improve the localization process in known or unknown environments. Localization can therefore operate in two modes: supervised, as in (Booij et al., 2009; Wnuk et al., 2004; Oscar et al., 2007; Miro et al., 2006), and unsupervised, as in (Abdullah et al., 2010). One of the most common tools used in machine learning is the K-means clustering technique, which clusters all the probabilistic features in the scene images in order to construct the codebook. Several works have used clustering, where the local image features in a training set are quantized into a "vocabulary" of visual words (Ho and Newman, 2007; Cummins and Newman, 2009; Schindler et al., 2007). By quantizing local features into visual words, clustering may reduce both the dimensionality of the features and the noise. The process of quantizing the features is quite similar to the Bag of Words (BOW) model, as in Uijlings et al. (2009). However, these visual words do not possess spatial relations. The BOW model is employed to obtain more accurate features for describing the scene image in place recognition. Cummins and Newman (2009) used BOW to describe appearance for a Simultaneous Localization and Mapping (SLAM) system, which was applied to a large-scale route of images. In Schindler et al.
(2007), informative features were proposed for each location, together with vocabulary trees (Nister and Stewenius, 2006), to recognize locations in the database. In contrast, Jan et al. (2010) measured only the statistics of mismatched features, which required only negative training data in the form of highly ranked mismatched images for a particular location. In Matej et al. (2002), an incremental eigenspace model was proposed to represent panoramic scene images taken from different locations, allowing incremental learning without the need to store all the input data. The study in Iwan and Illah (2000) was based on color histograms of images taken from an omnidirectional sensor; these histograms were used for appearance-based localization. Recently, most work in this area has focused on large-scale navigation environments. For example, in Murillo and Kosecka (2009), a global descriptor for portions of panoramic images was used for similarity measurements to match images in a large-scale outdoor Street View dataset. In Jana et al. (2003), qualitative topological localization was established by segmentation of temporally adjacent