Vol. 14, No. 1, pp. 55-63, 2009.
© Association for Scientific Research
Mehmet Ali Salahli
Department of Computer Engineering
Canakkale On Sekiz Mart University, 17100
Canakkale, Turkey
msalahli@comu.edu.tr
In this paper we propose a new approach for measuring semantic relatedness
between words. The semantic relatedness between words is not measured directly, but
is computed via a set of words highly related to them, which we call the set of
determiner words. Our approach to evaluating relatedness belongs to the web page
counting based measurement methods. We also take into account information about
hierarchical and other types of relations between the words. The experimental
results demonstrate the effectiveness of the proposed method.
Keywords: semantic relatedness, semantic similarity, information based measurement,
information content
1. INTRODUCTION
Measures of relatedness or similarity are used in a variety of applications, such as
information retrieval, automatic indexing, word sense disambiguation, and automatic text
correction. Semantic similarity and semantic relatedness are sometimes used
interchangeably in the literature. These terms, however, are not identical. Semantic
relatedness indicates the degree to which words are associated via any type of semantic
relationship (such as synonymy, meronymy, hyponymy, hypernymy, functional, associative
and other types). Semantic similarity is a special case of relatedness and takes
into consideration only hyponymy/hypernymy relations. The relatedness measures may
use a combination of the relationships existing between words depending on the context
or their importance. To illustrate the difference between similarity and relatedness, Resnik
[1] provides the widely used example of car and gasoline. These terms are not very
similar; they have only a few features in common. But they are more closely related in a
functional context, namely that cars use gasoline. A number of researchers use a distance
measure as the opposite of similarity.
In this work we propose a new approach for measuring semantic relatedness
between words. The main idea of the approach is that the semantic relatedness between
words is not measured directly, but is determined via a set of words highly related to
them, which we call the set of determiner words. Our approach to evaluating
relatedness belongs to the web page counting based measurement methods, but we take
into account additional information expressing hierarchical and other types of relations
between the words. Comparison of the experimental results with a benchmark set of human
similarity ratings shows the effectiveness of the proposed approach.
The paper is organized as follows. Section 2 presents related work. Section 3
gives the motivation for the proposed method. The method for evaluating semantic