978-1-4799-2446-2/13/$31.00 ©2013 IEEE 167
IMAGE SEGMENTATION APPROACH IN MULTIMODAL INFORMATION
RETRIEVAL
SHAIKH RIAZ AHMED¹, JIAN-PING LI¹, MEMON MUHAMMAD HAMMAD¹, KHAN ASIF¹
¹School of Computer Science & Engineering, University of Electronics Science & Technology, Chengdu,
611731, China
E-Mail: riaz.shaikh@salu.edu.pk, jpli2222@uestc.edu.cn
Abstract:
In recent years, information retrieval has become a vast
area of research that focuses not only on single-mode
retrieval systems but also on multimodal (i.e. text, image,
audio and video) information retrieval systems that bind
multiple modalities into a single repository. A multimodal
retrieval system provides features to search and retrieve
information that is available in multiple formats. Such
systems have not yet reached their initial goal of managing
and searching images in a database, because we are unable
to link the semantic sense of an image to numerical values.
To meet this requirement, segmentation is used as a
preprocessing step in Content Based Image Retrieval. In
this context, we propose a statistical segmentation
approach in this paper. After segmentation, features are
extracted from the segmented images; these features help
in understanding the contents of the image and in
retrieving the information.
Keywords:
Content Based Image Retrieval; Image Segmentation;
Feature Extraction
1. Introduction
A multimodal information retrieval system binds
multiple modalities into a single repository and provides
features to search and retrieve information that is
available in multiple formats, i.e. text, image, audio and
video. Nowadays the popularity of multimedia demands
intelligent and efficient handling in order to cope with
the large amount of multimedia data. Recent efforts in
the area of multimedia retrieval systems have led to a
growing research community and a number of
international, national and industrial projects. Besides
focusing on single-media retrieval systems, the latest
technologies target multimodal retrieval engines. This
development is clearly becoming the mainstream trend as
queries such as "Show me the video and related
documents for the given score available by melody and
text snippets" (possibly by humming) or "Give me all
media (text, image, video, audio) containing information
about the sea-shore" come into vogue.
2. Background
Research on content-based multimedia retrieval has
grown enormously in the last few years. New trends in
content-based image retrieval are discussed in [1].
Multimodal Information Retrieval focuses on strategies
that combine different media sources simultaneously,
while multimedia information retrieval is oriented
towards processing a single multimedia object to provide
search functionality by content. Various works have been
developed in the area of Multimodal Information
Retrieval, and some interesting results have shown the
effectiveness of this approach. Multimodal Information
Retrieval has not yet been deeply evaluated for
multimedia retrieval, and recent studies suggest the
potential advantages of using multimodal synergies in
multimedia databases [2].
Early approaches to image retrieval used a set of
visual features to describe the main structure of image
contents and then defined a similarity measure to search
for associated images [12][13]. Shortly thereafter, it
became clear that raw visual content was not enough to
retrieve relevant images, so the introduction of semantic
information became the main problem [7].
Semantic knowledge has been incorporated into
content-based image retrieval (CBIR) systems in different
ways, as discussed by Liu et al. [10], who present five
different approaches. The first approach is
based on ontology construction, in which low-level
features are organized according to semantic categories.
The second approach is the use of machine learning to
automatically recognize image contents. This approach is
often related to the automatic image annotation task
[11][12], since the goal of using machine learning