A Meronomic Relatedness Measure for Domain Ontologies Using Concept Probability and Multiset Theory

Paul Witherell, Sundar Krishnamurty, Ian Grosse
Department of Mechanical and Industrial Engineering
University of Massachusetts Amherst
Amherst, MA

Jack Wileden
Department of Computer Science
University of Massachusetts Amherst
Amherst, MA

Abstract—Semantic relatedness measures provide a means to determine how closely related two concepts may or may not be. In the area of ontology alignment, many lexical-based relatedness measures have been successfully applied within the realm of domain ontologies. The alignment initiative, however, has not included all measures of relatedness. More generic measures of relatedness, such as meronomy-based measures, have yet to be established beyond lexical ontologies. This paper introduces an algorithm for measuring meronomic relatedness between concepts within a domain ontology. Specifically, a new method is proposed for measuring how much one concept is “part of” another in a domain ontology. This is accomplished by utilizing inherent attributes of these ontologies in concert with protocols currently applied in established relatedness measures. Key features of this method include a unique approach to the weighted edge measure, one in which each edge is weighted by applying a concept probability algorithm to a multiset composed of ontology property ranges. The application of this method is then illustrated with the aid of two case studies, namely a camera ontology and a wine ontology, and the results are discussed.

Keywords-semantic relatedness, ontology, meronomy

I. INTRODUCTION

Semantic relatedness measures have become a well-known means for measuring the closeness or likeness between concepts in Natural Language Processing and have been implemented in lexical ontologies such as WordNet [1].
The practice of ontology alignment [2] has resulted in many of these lexical-based measures being successfully carried over into the realm of domain ontologies, where relationships exist between concepts in lieu of words. However, because the objective of alignment is to match concepts [3], this initiative has not encompassed all measures of relatedness. Consequently, more generic measures of relatedness, which go beyond measuring likeness between concepts, have yet to be established within domain ontologies. Applying methods based on such relationships within domain ontologies has the potential to provide increased insight into how domain concepts within an ontology are, and can be, related.

The term “semantic relatedness” refers to several types of lexical relationships, including synonymy, hyponymy/hypernymy, meronomy/holonymy, and antonymy, as well as any other unsystematic relationships, such as functional relationships. The hyponymy relation, also known as the “is-a” relation, is typically seen in a subsumption hierarchy, such as an ontology, and its inverse is known as hypernymy. Any relationship from the group of “component of”, “member of”, and “substance of” relationships can be considered meronomic, and holonymic relationships are their inverses. The antonymic relationship is also known as the “complement of” relation [4]. Concept pairs are considered semantically similar only when some combination of the synonymy, hyponymy, and hypernymy relationships holds.

To explain how two concepts may be semantically related yet not necessarily similar, Resnik [5] uses the example of a car and gasoline: “for example, cars and gasoline would seem to be more closely related than, say, cars and bicycles, but the latter pair are certainly more similar.” Intuitively, a closer association may be found between gas and a car than between a car and a bike. However, using a strictly feature-based comparison, the bicycle is more like, or similar to, the car.
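The distinction between "related" and "similar" drawn above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the proposed method: the names `Relation`, `SIMILARITY_RELATIONS`, `is_related`, and `is_similar` are hypothetical, and the relation sets assigned to each concept pair are hand-labeled assumptions rather than output from an ontology. It also assumes one reading of the similarity condition, namely that a pair is similar only if every relation linking it comes from the synonymy/hyponymy/hypernymy group.

```python
from enum import Enum


class Relation(Enum):
    """Relation types listed in the text (names are illustrative)."""
    SYNONYMY = "synonymy"
    HYPONYMY = "hyponymy"      # the "is-a" relation
    HYPERNYMY = "hypernymy"    # inverse of hyponymy
    MERONOMY = "meronomy"      # "component of", "member of", "substance of"
    HOLONYMY = "holonymy"      # inverse of meronomy
    ANTONYMY = "antonymy"      # "complement of"
    FUNCTIONAL = "functional"  # an unsystematic relationship


# Assumption: only these relation types count toward semantic similarity.
SIMILARITY_RELATIONS = frozenset(
    {Relation.SYNONYMY, Relation.HYPONYMY, Relation.HYPERNYMY}
)


def is_related(relations):
    """A pair is related if at least one relation of any type links it."""
    return len(relations) > 0


def is_similar(relations):
    """A pair is similar if it is related and every linking relation
    belongs to the synonymy/hyponymy/hypernymy group (assumed reading)."""
    return is_related(relations) and set(relations) <= SIMILARITY_RELATIONS


# Hand-labeled, hypothetical relation sets for the car example.
car_gasoline = {Relation.MERONOMY, Relation.FUNCTIONAL}
car_bicycle = {Relation.HYPERNYMY, Relation.HYPONYMY}  # linked via a shared "vehicle" parent

print(is_related(car_gasoline), is_similar(car_gasoline))
print(is_related(car_bicycle), is_similar(car_bicycle))
```

Under these assumed labels, both pairs count as related, but only the car and bicycle pair counts as similar, which mirrors the car and gasoline example.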
A further examination of the semantic relatedness between gas and a car reveals that, when the car is used as a mode of transportation, gas can be considered part of the car. A more obvious example of meronomy is the comparison of a car engine and a car, noting that the engine is part of the car. However, without the engine the car can still be considered a car. Alternatively, a comparison between steel and a car reveals that steel represents a significant portion of the car, since steel is the primary material used in most cars. Intuitively, the conclusion can be drawn that steel has a stronger meronomic relationship to a car than an engine does. Hence, a properly constructed relatedness measure should be able to quantify such intuition and evaluate the degree to which one concept is “part of” another in a domain ontology.

II. BACKGROUND

A. Types of Relatedness Measures

Semantic relatedness measures can be classified into four distinct categories: context vector, feature matching, path distance, and information content (IC) [6] [7] [8]. Context vector measures were introduced by Patwardhan and Pedersen [9] as a means of providing a more general representation of relatedness, though they can be computationally intensive [6].

978-1-4244-4577-6/09/$25.00 ©2009 IEEE
The 28th North American Fuzzy Information Processing Society Annual Conference (NAFIPS2009), Cincinnati, Ohio, USA - June 14 - 17, 2009