978-1-4799-2446-2/13/$31.00 ©2013 IEEE

IMAGE SEGMENTATION APPROACH IN MULTIMODAL INFORMATION RETRIEVAL

SHAIKH RIAZ AHMED 1, JIAN-PING LI 1, MEMON MUHAMMAD HAMMAD 1, KHAN ASIF 1

1 School of Computer Science & Engineering, University of Electronics Science & Technology, Chengdu, 611731, China
E-Mail: riaz.shaikh@salu.edu.pk, jpli2222@uestc.edu.cn

Abstract:
In recent years, information retrieval has been a vast area of research that focuses not only on single-mode retrieval systems but also on multimodal (i.e. text, image, audio and video) information retrieval systems that bind multiple modalities into a single repository. A multimodal retrieval system provides features to search and retrieve information that is available in multiple formats. Such systems have not yet reached their initial goal: to manage and search images in a database, we are still unable to link the semantic sense of an image to numerical values. To address this, segmentation is used as a preprocessing step in Content Based Image Retrieval. In this context, we propose a statistical segmentation approach in this paper. After segmentation, features are extracted from the segmented images; these features help in understanding the contents of the image and in retrieving the information.

Keywords:
Content Based Image Retrieval; Image Segmentation; Feature Extraction

1. Introduction

Multimodal information retrieval systems bind multiple modalities into a single repository and provide features to search and retrieve information that is available in multiple formats, i.e. text, image, audio and video. Nowadays, the popularity of multimedia demands intelligent and efficient handling in order to cope with the large amount of multimedia data. Recent efforts in the area of multimedia retrieval systems have led to a growing research community and a number of international, national and industrial projects.
Besides focusing on single-media retrieval systems, the latest technologies target multimodal retrieval engines. This development is clearly becoming the mainstream trend as queries such as "Show me the video and related documents for the given score available by melody and text snippets" (perhaps by humming) or "Give me all media (text, image, video, audio) containing information about the sea-shore" come into vogue.

2. Background

Research on content-based multimedia retrieval has grown enormously in the last few years. New trends regarding content-based image retrieval have been discussed in [1]. The research field of Multimodal Information Retrieval focuses on strategies that simultaneously combine different media sources, while multimedia information retrieval is oriented toward processing a single multimedia object to provide search functionality by content. Various works have been developed in the area of Multimodal Information Retrieval, and some interesting results have shown the effectiveness of this approach. However, Multimodal Information Retrieval has not been deeply evaluated for multimedia retrieval, and recent studies suggest the potential advantages of using multimodal synergies in multimedia databases [2].

Early approaches to image retrieval used a set of visual features to describe the main structure of image contents and then defined a similarity measure to search for associated images [12][13]. Shortly thereafter, it became clear that raw visual content was not enough to retrieve relevant images, so the introduction of semantic information became the main problem [7]. Semantic knowledge has been incorporated into content-based image retrieval (CBIR) systems in different ways, as discussed by Liu et al. [10], who present five different approaches. The first approach is based on ontology construction, in which low-level features are organized according to semantic categories.
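
As an illustration of the ontology-based idea, the following minimal Python sketch maps a low-level feature (the mean colour of an image region) to a semantic category name. It is not the method of Liu et al. [10] or of this paper. The hue intervals, the category names, the saturation and brightness thresholds, and the function names `semantic_colour` and `describe_region` are all hypothetical choices made for illustration only.

```python
import colorsys

# Hypothetical hue intervals (in degrees) and the semantic names assigned to them.
HUE_CATEGORIES = [
    (0, 30, "red"),
    (30, 90, "yellow"),
    (90, 150, "green"),
    (150, 270, "blue"),
    (270, 330, "purple"),
    (330, 360, "red"),
]


def semantic_colour(rgb):
    """Map an RGB triple (0-255 per channel) to a semantic colour name."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # Assumed thresholds: very dark or weakly saturated colours carry no hue label.
    if v < 0.2:
        return "black"
    if s < 0.15:
        return "white" if v > 0.8 else "gray"
    hue_deg = h * 360.0
    for low, high, name in HUE_CATEGORIES:
        if low <= hue_deg < high:
            return name
    return "red"  # defensive fallback; hue wraps around at 360 degrees


def describe_region(pixels):
    """Label a region (a non-empty list of RGB triples) by its mean colour."""
    n = len(pixels)
    mean_rgb = tuple(sum(p[i] for p in pixels) / n for i in range(3))
    return semantic_colour(mean_rgb)


if __name__ == "__main__":
    sky_region = [(30, 60, 200), (40, 70, 210), (20, 50, 190)]
    print(describe_region(sky_region))
```

In a fuller system, such labels would be one intermediate layer between numeric region features and higher-level concepts; the single colour descriptor used here is only a stand-in for that layer.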
The second approach is the use of machine learning to automatically recognize image contents. This approach is often related to the automatic image annotation task [11][12], since the goal of using machine learning