Content-Based Image Retrieval Using Combined Features and Weighted Similarity

Ahmed Hosny El-Kholy
Dept. of Computer Science, Faculty of Computers & Information, Assiut Univ., Assiut 71526, EGYPT
Email: ahmed.ibrahim@compit.au.edu.eg

Ahmed Mahmoud Abdel-Haleim
Dept. of Information Technology, Faculty of Computers & Information, Assiut Univ., Assiut 71526, EGYPT
Email: ahmedmahmoud@assiut.edu.eg

Abdel-Rahman Hedar
Dept. of Computer Science, Faculty of Computers & Information, Assiut Univ., Assiut 71526, EGYPT
Email: hedar@aun.edu.eg

Abstract—The process of retrieving desired images from a large collection of images on web pages on the basis of their features is referred to as Content-Based Image Retrieval (CBIR), and a system that performs it as a CBIR search engine. In this paper, a new CBIR search engine is proposed that uses a combination of two features and a weighted similarity. Experimental simulations are presented to show the efficiency of the proposed search engine.

Keywords: Clustering, CBIR, Search Engine, Image Search

I. INTRODUCTION

The Web is a complex and unique source of multimedia information. Digitized text, audio, images and video link to each other on the Web of information-bearing objects. In this paper, we focus on users searching for images. It is estimated that there are 20 billion images in Imageshack, and Facebook holds 15 billion photos [15]. The third largest image warehouse on the Web appears to be News Corp's PhotoBucket, with 7.2 billion photos, followed by Yahoo's Flickr at 3.4 billion, a count that also includes some videos. Interestingly, coming in right behind Flickr in photo count is the social network Multiply, with 3 billion images [15].

Problems with text-based access to images have prompted increasing interest in the development of image-based solutions. This approach is most often referred to as content-based image retrieval (CBIR).
A CBIR search engine relies on the characterization of primitive features such as color, shape, and texture that are automatically extracted from the images themselves. There are several techniques for dealing with CBIR problems. In [12], [13], CBIR methods based on the color feature are proposed. Retrieving images based on their shape features has also been considered in many research works [11]. In this paper, color and shape features are combined, with weights determined by the user, to compose a new CBIR method. The proposed method invokes a weighted similarity function to search for the images most closely related to the input. A search engine is implemented based on the proposed method to give a quick response with efficient outputs. Several simulations, some of which are shown later, have been carried out to test the efficiency of the proposed method and its search engine.

The paper is organized as follows. In the next section, we give the basic preliminaries needed throughout the paper. In Section III, we present the proposed method for our system. Section IV presents some details of the system implementation. In Section V, we report numerical results for two types of benchmark problems. Finally, concluding remarks and future work make up Section VI.

II. PRELIMINARIES

A. Feature Extraction Methods

Feature extraction is a method of capturing the visual content of images for indexing and retrieval [14]. The features should carry enough information about the image and should not require any domain-specific knowledge for their extraction. They should also be easy to compute, so that the approach is feasible for a large image collection and rapid retrieval. In addition, they should relate well to human perceptual characteristics, since users ultimately determine the suitability of the retrieved images.
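As a rough illustration of how an easily computed feature and a user-weighted similarity might fit together, the following Python sketch builds a coarse color histogram and combines a color similarity with a shape similarity. It is not the implementation used in this paper: the histogram binning, the histogram-intersection measure, the 1/(1+d) shape similarity, and all function names are assumptions chosen for illustration, and the shape descriptor is treated as an opaque numeric vector.

```python
from math import sqrt


def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of (r, g, b) tuples with values in 0..255."""
    n = bins_per_channel
    hist = [0.0] * (n ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each channel value to a bin index in 0..n-1.
        idx = ((r * n // 256) * n + (g * n // 256)) * n + (b * n // 256)
        hist[idx] += 1.0
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return [h / count for h in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalized histograms (assumed measure)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def shape_similarity(s1, s2):
    """Assumed shape similarity: 1 / (1 + Euclidean distance) between descriptors."""
    d = sqrt(sum((a - b) ** 2 for a, b in zip(s1, s2)))
    return 1.0 / (1.0 + d)


def weighted_similarity(color_sim, shape_sim, w_color, w_shape):
    """Combine per-feature similarities using user-chosen, non-negative weights."""
    total = w_color + w_shape
    if w_color < 0 or w_shape < 0 or total == 0:
        raise ValueError("weights must be non-negative and not both zero")
    return (w_color * color_sim + w_shape * shape_sim) / total


def rank_images(query, database, w_color=0.5, w_shape=0.5):
    """Return (name, score) pairs sorted from most to least similar to the query.

    `query` is a (histogram, shape_descriptor) pair; `database` maps image
    names to such pairs. Both layouts are hypothetical.
    """
    q_hist, q_shape = query
    scored = []
    for name, (hist, shape) in database.items():
        score = weighted_similarity(
            histogram_intersection(q_hist, hist),
            shape_similarity(q_shape, shape),
            w_color,
            w_shape,
        )
        scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Raising `w_color` relative to `w_shape` makes the ranking favor images with a similar color distribution, while the opposite choice favors images with similar shape descriptors; dividing by the weight total keeps the combined score in the same range as the individual similarities.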
In a content-based image retrieval search engine, the challenge is to choose features that satisfy the characteristics described above and to return results to users in a reasonable time. We discuss later the two feature extraction methods used in our search engine and the way we use them in image retrieval.

B. Clustering

Classifying the images into clusters is beneficial because it reduces the search domain in such search engines. Clusters are connected regions of a multi-dimensional space containing a relatively

2010 2nd International Conference on Computer Technology and Development (ICCTD 2010) 645 978-1-4244-8843-8/10/$26.00 © 2010 IEEE