International Journal of Engineering Innovation & Research
Volume 1, Issue 2, ISSN : 2277 – 5668
Copyright © 2012 IJEIR, All rights reserved
Automatic Shape Annotation Using Rough Sets and
Decision Trees
Manoj P. Patil
Department of Computer Science
North Maharashtra University, Jalgaon, (M. S.)
mpp145@gmail.com
Satish R. Kolhe
Department of Computer Science
North Maharashtra University, Jalgaon, (M. S.)
srkolhe2000@gmail.com
Abstract — Annotation of images automatically assigns
tags to images by analyzing their contents. Shape is the
most important feature of images; tagging images by their
shape features can be termed automatic shape annotation.
In this paper, novel classifiers using the machine learning
techniques Rough Set (RS) and Decision Tree (DT) are
presented to classify shape images of a standard dataset for
annotation purposes. Shape-based features are extracted and
organized to form a shape feature vector.
The Rough Set Exploration System (RSES) is used to develop
decision tree based and rough set based classifiers for the
tagging of shapes. The results obtained using these
classifiers are presented and discussed. The RS classifier
significantly improves the annotation performance.
Keywords — Automatic image annotation, shape features,
decision tree, rough sets.
I. INTRODUCTION
The description of the object shape is an important task
in image analysis and pattern recognition. The shapes
occurring in images also have a remarkable
significance in image retrieval [1]. The ever-growing
number of images generated every day is the reason to
develop, evaluate and implement sophisticated automatic
annotation systems for the retrieval of images from large
databases based on their content rather than their manual
annotations. Although computers are still a long way from
identifying and textually describing image concepts in the
way humans do, it is possible to train computers on large
previously annotated image databases, in order to learn the
associations between visual image data and their textual
descriptions [2].
Automatic image annotation systems have received
intensive attention in the literature on image information
retrieval since the area emerged, and consequently a broad
range of techniques has been proposed. The algorithms used
in these systems perform four tasks, namely feature
extraction, feature selection, training of the annotation
system, and annotation of new images.
The extraction task transforms the rich content of images
into a set of features. Feature extraction is a special form
of dimensionality reduction. The generated features are then
passed to feature selection, which reduces the number of
features provided to train the system. Features that are
likely to assist in discrimination are selected and used in
the annotation task. Features that are similar across shapes
and cannot discriminate between them are discarded. The end
result of the extraction process is a set of features,
commonly called a feature vector, which composes a
representation of the image.
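The extraction and selection tasks described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the three descriptors (area, aspect ratio, extent), the variance-based selection rule, and the function names are hypothetical choices made for the example, not the features or the selection method used in this paper.

```python
from statistics import pvariance


def extract_features(mask):
    """Turn a non-empty binary mask (list of rows of 0/1) into a feature vector.

    The descriptors below are illustrative assumptions, not the paper's features.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    area = sum(sum(row) for row in mask)
    height = rows[-1] - rows[0] + 1   # bounding-box height
    width = cols[-1] - cols[0] + 1    # bounding-box width
    return {
        "area": float(area),
        "aspect_ratio": width / height,
        "extent": area / (width * height),  # fraction of the bounding box filled
    }


def select_features(vectors, min_variance=1e-6):
    """Keep the names of features whose values vary across the samples.

    A feature that takes (almost) the same value for every shape cannot
    discriminate between shapes, so it is discarded.
    """
    names = vectors[0].keys()
    return [n for n in names if pvariance([v[n] for v in vectors]) > min_variance]


# Two toy shapes: a 2x2 square and a 1x4 bar, both with area 4 and extent 1.
square = [[1, 1], [1, 1]]
bar = [[1, 1, 1, 1]]
vectors = [extract_features(square), extract_features(bar)]
selected = select_features(vectors)
print(selected)
```

In this toy case only the aspect ratio differs between the two shapes, so it is the only feature that survives selection.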
Among other generic image features like color and
texture that are used to achieve the classification objective,
shape is considered the most promising for the
identification of entities in an image [3].
Shape is a fundamental image feature and one of the
most important image features used in image annotation
and retrieval. This feature alone provides the capability to
recognize and classify objects and to retrieve similar images
on the basis of their contents [4].
Among classification algorithms, decision tree
algorithms are the most commonly used because they are easy
to understand and cheap to implement. They provide a
modeling technique that is easy for humans to comprehend
and simplifies the classification process [5]. A decision
tree can be constructed from a set of instances by a
divide-and-conquer strategy. If all the instances belong to the
same class, the tree is a leaf with that class as label.
Otherwise, a test is chosen that has different outcomes for
at least two of the instances, which are partitioned
according to this outcome. The tree has as its root a node
specifying the test and for each outcome in turn, the
corresponding sub-tree is obtained by applying the same
procedure to the subset of instances with that outcome.
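The divide-and-conquer construction described above can be sketched as a short recursive function. This is a minimal illustration, assuming categorical features and string class labels; the test is chosen naively (the first feature with at least two outcomes), whereas practical algorithms typically rank candidate tests by a measure such as information gain. The names `build_tree` and `classify` and the toy attributes are hypothetical and do not describe the RSES implementation.

```python
from collections import Counter


def build_tree(instances, tests):
    """Build a decision tree from (features_dict, label) pairs.

    A leaf is a class label (a string); an internal node is a tuple
    (feature_name, {outcome: subtree}).
    """
    labels = [label for _, label in instances]
    if len(set(labels)) == 1:
        return labels[0]  # all instances share one class: leaf
    # Choose a test with different outcomes for at least two instances.
    for name in tests:
        outcomes = {feats[name] for feats, _ in instances}
        if len(outcomes) >= 2:
            break
    else:
        # No remaining test separates the instances: fall back to a majority leaf.
        return Counter(labels).most_common(1)[0][0]
    remaining = [t for t in tests if t != name]
    branches = {}
    for outcome in outcomes:
        subset = [(f, l) for f, l in instances if f[name] == outcome]
        branches[outcome] = build_tree(subset, remaining)  # same procedure on the subset
    return (name, branches)


def classify(tree, feats):
    """Follow the tests from the root until a leaf label is reached."""
    while isinstance(tree, tuple):
        name, branches = tree
        tree = branches[feats[name]]
    return tree


# Toy training set with made-up shape attributes.
data = [
    ({"corners": "none", "elongated": "no"}, "circle"),
    ({"corners": "four", "elongated": "no"}, "square"),
    ({"corners": "four", "elongated": "yes"}, "rectangle"),
]
tree = build_tree(data, ["corners", "elongated"])
label = classify(tree, {"corners": "four", "elongated": "yes"})
print(tree)
print(label)
```

The root tests `corners`; the `none` branch is immediately a leaf, while the `four` branch is split again on `elongated`, mirroring the recursive step in the text.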
Rough set theory can be regarded as a new
mathematical tool for imperfect data analysis. Rough set
philosophy is founded on the assumption that with every
object of the universe of discourse some information (data,
knowledge) is associated. Objects characterized by the
same information are indiscernible (similar) in view of the
available information about them. The indiscernibility
relation generated in this way is the mathematical basis of
rough set theory. Any set of all indiscernible (similar)
objects is called an elementary set, and forms a basic
granule (atom) of knowledge about the universe. Any
union of some elementary sets is referred to as a crisp
(precise) set – otherwise the set is rough (imprecise,
vague).
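The notions of elementary set, crisp set and rough set can be illustrated with a small Python sketch. It groups objects into elementary sets by their attribute values and then computes the standard lower and upper approximations of a target set: the target is crisp exactly when the two coincide, and rough otherwise. The object table, attribute names and function names are hypothetical and chosen only for the example.

```python
from collections import defaultdict


def elementary_sets(objects, attributes):
    """Group object ids that are indiscernible on the given attributes."""
    groups = defaultdict(set)
    for obj_id, info in objects.items():
        key = tuple(info[a] for a in attributes)  # the information attached to the object
        groups[key].add(obj_id)
    return list(groups.values())


def approximations(objects, attributes, target):
    """Return the (lower, upper) approximations of `target`."""
    lower, upper = set(), set()
    for granule in elementary_sets(objects, attributes):
        if granule <= target:   # granule lies entirely inside the target
            lower |= granule
        if granule & target:    # granule overlaps the target
            upper |= granule
    return lower, upper


# Toy information table; s1 and s2 carry identical information.
objects = {
    "s1": {"corners": "four", "elongated": "no"},
    "s2": {"corners": "four", "elongated": "no"},
    "s3": {"corners": "four", "elongated": "yes"},
    "s4": {"corners": "none", "elongated": "no"},
}
target = {"s1", "s3"}
lower, upper = approximations(objects, ["corners", "elongated"], target)
is_rough = lower != upper
print(sorted(lower), sorted(upper), is_rough)
```

Because `s1` and `s2` are indiscernible, the set `{s1, s3}` cannot be written as a union of elementary sets, so it is rough: only `s3` certainly belongs to it, while `s1`, `s2` and `s3` possibly belong to it.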
In this paper, automatic annotation of shapes using
decision tree and rough set techniques is discussed. A
novel classifier using Rough Set (RS) is presented to
classify shape images of a standard dataset for annotation
purposes. Shape features are extracted from the input
images and then classification is performed. Decision tree
generation, discretization and rule extraction for rough
sets are accomplished using RSES. Classifiers using
decision tree and rough set techniques are formulated in
RSES.
The description of the use of various machine learning
techniques for classification is provided in Section 2.