Semantic Matching to Achieve Web Service Discovery and Composition
Rama Akkiraju (1), Biplav Srivastava (2), Anca Ivan (1), Richard Goodwin (1), Tanveer Syeda-Mahmood (3)
(1) IBM T. J. Watson Research Center, 19 Skyline Drive, Hawthorne, NY, 10532, USA
(2) IBM India Research Laboratory, Block 1, IIT Campus, Hauz Khaus, New Delhi, 11016, India
(3) IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120, USA
{akkiraju@us, sbiplav@in, ancaivan@us, rgoodwin@us, stf@almaden}.ibm.com
Abstract
In this paper, we present a novel algorithm to discover
and compose web services in the presence of semantic
ambiguity by combining semantic matching and AI
planning algorithms. Specifically, we use cues from
domain-independent and domain-specific ontologies to
compute an overall semantic similarity score between
ambiguous terms. This semantic similarity score is used
by AI planning algorithms to guide the searching process
when composing services. In addition, we integrate
semantic and ontological matching with an indexing
method, which we call attribute hashing, to enable fast
lookup of semantically related concepts.
1. Introduction
Web services are becoming an important technological
component in implementing service-oriented architectures.
With their growing popularity, Web services matching and
composition have become topics of increasing interest in
recent years. Two main directions have emerged. The first
investigated the application of AI planning algorithms to
compose services. The second explored the application of
information retrieval techniques. However, to the best of our
knowledge, these two techniques have not been combined
to achieve compositional matching in the presence of
inexact terms and thereby improve recall. In this paper, we
present a novel approach to composing Web services in the
presence of semantic ambiguity using a combination of
semantic matching and AI planning algorithms.
Specifically, we use domain-independent and domain-
specific ontologies to determine the semantic similarity
between ambiguous concepts/terms. The domain-
independent relationships are derived using an English
thesaurus after tokenization and part-of-speech tagging.
The domain-specific ontological similarity is derived by
reasoning over the semantic annotations associated with Web
service descriptions using an ontology. Matches from the
two cues are combined to determine an overall similarity
score. This semantic similarity score is used by AI
planning algorithms in composing services. By combining
semantic scores with planning algorithms we show that
better results can be achieved than the ones obtained using
a planner or matching alone.
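
The combination of the two cues can be illustrated with a minimal sketch. The weighted average, the weights, the threshold, and the example scores below are assumptions made for illustration only; the text above does not specify the combination rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CueScores:
    thesaurus: float  # domain-independent cue, assumed to lie in [0, 1]
    ontology: float   # domain-specific cue, assumed to lie in [0, 1]


def combined_score(cues, w_thesaurus=0.5, w_ontology=0.5):
    """Weighted average of the two cue scores (hypothetical combination rule)."""
    total = w_thesaurus + w_ontology
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return (w_thesaurus * cues.thesaurus + w_ontology * cues.ontology) / total


def rank_candidates(candidates, threshold=0.6):
    """Keep candidates whose combined score reaches the threshold, best first."""
    scored = [(name, combined_score(cues)) for name, cues in candidates.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Illustrative cue scores for two candidate terms (made-up values).
candidates = {
    "lemmaProp": CueScores(thesaurus=0.5, ontology=1.0),
    "canonicalString": CueScores(thesaurus=0.25, ontology=0.0),
}
ranked = rank_candidates(candidates)
```

A planner could consult such a ranked list when choosing which service to try next; how the score actually enters the search is not shown in this sketch.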
In the remainder of the paper, we start with a scenario that
illustrates the need for Web services composition in open
business domains and discuss how our approach helps
resolve semantic ambiguities. We then give
details of the SEMAPLAN system and discuss how the
engine was customized for the IEEE WS Challenge.
2. A Motivating Scenario
In this section, we present a scenario from the knowledge
management domain to illustrate the need for (semi-)automatic
composition of Web services. For example, if a
user would like to identify the names of authors in a given
document, text annotators such as a Tokenizer, which
identifies tokens, a LexicalAnalyzer, which identifies parts
of speech, and a NamedEntityRecognizer, which identifies
references to people, things, etc., could be composed to
meet the request. Figure 1 summarizes this
composition flow.
Figure 1. Example of a composition of Web services
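
The composition flow of the scenario can be sketched as a simple forward-chaining search over service inputs and outputs. This is a stand-in for an AI planner, not the planner used in the paper; the Service type, the compose function, and the interface names ("Doc", "Tokens", "PartsOfSpeech", "NamedEntity") are hypothetical, and names are matched exactly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    inputs: frozenset   # names of the data items the service requires
    outputs: frozenset  # names of the data items the service produces


def compose(services, available, goal):
    """Greedy forward chaining: apply any runnable service until the goal is covered.

    Returns the ordered list of service names, or None if the goal is unreachable.
    """
    known = set(available)
    remaining = list(services)
    plan = []
    while not goal <= known:
        # A service is runnable if its inputs are known and it adds something new.
        runnable = [s for s in remaining
                    if s.inputs <= known and not s.outputs <= known]
        if not runnable:
            return None
        step = runnable[0]
        known |= step.outputs
        plan.append(step.name)
        remaining.remove(step)
    return plan


# Hypothetical interfaces for the three annotators of the scenario.
services = [
    Service("NamedEntityRecognizer", frozenset({"PartsOfSpeech"}), frozenset({"NamedEntity"})),
    Service("LexicalAnalyzer", frozenset({"Tokens"}), frozenset({"PartsOfSpeech"})),
    Service("Tokenizer", frozenset({"Doc"}), frozenset({"Tokens"})),
]
plan = compose(services, available={"Doc"}, goal={"NamedEntity"})
```

Exact name matching is the simplifying assumption in this sketch: the chain is found only because each output name equals the next input name.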
In this example, the term lexemeAttr may not match
lemmaProp unless each term is split into its parts (lexeme and
Attr, lemma and Prop) and the parts are matched separately.
Using a linguistic domain
ontology, one can infer that lemma could be considered a
match for the term lexeme. An abbreviation expansion rule
can be applied to the terms Attr and Prop to expand them
to Attribute and Property. Then a consultation with a
domain-independent thesaurus such as WordNet [6]
[Figure 1 labels: Request; Matched Services (any service or service combinations); Tokenizer, Lexical Analyzer, Named Entity Recognizer; Doc, Text, Tokens, LexemeAttr, LemmaProp, CanStr, CanonicalString, CanonicalCategory, NamedEntity; relations shown: subClassOf, ~=]
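
The matching steps described for lexemeAttr and lemmaProp (splitting terms, expanding abbreviations, and consulting a domain ontology and a thesaurus) can be sketched as follows. The three small lookup tables are hypothetical stand-ins for an abbreviation rule set, a thesaurus such as WordNet, and a linguistic domain ontology, and the scoring rule is an assumption made for illustration.

```python
import re

# Hypothetical stand-ins for the real resources.
ABBREVIATIONS = {"attr": "attribute", "prop": "property", "str": "string"}
SYNONYMS = {frozenset({"attribute", "property"})}    # thesaurus stand-in
ONTOLOGY_MATCHES = {frozenset({"lexeme", "lemma"})}  # domain ontology stand-in


def split_term(term):
    """Split a camelCase identifier into lowercase tokens."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", term)
    return [part.lower() for part in parts]


def expand(token):
    """Apply the abbreviation expansion rule, if one exists for the token."""
    return ABBREVIATIONS.get(token, token)


def tokens_match(a, b):
    """Two tokens match if equal, synonymous, or related in the ontology."""
    pair = frozenset({a, b})
    return a == b or pair in SYNONYMS or pair in ONTOLOGY_MATCHES


def term_similarity(left, right):
    """Fraction of tokens in `left` that match some token in `right`."""
    left_tokens = [expand(t) for t in split_term(left)]
    right_tokens = [expand(t) for t in split_term(right)]
    if not left_tokens or not right_tokens:
        return 0.0
    matched = sum(1 for a in left_tokens
                  if any(tokens_match(a, b) for b in right_tokens))
    return matched / max(len(left_tokens), len(right_tokens))


score = term_similarity("lexemeAttr", "lemmaProp")
```

With these tables, lexeme matches lemma through the ontology stand-in and attribute matches property through the thesaurus stand-in, so the two terms are treated as fully matching even though they share no tokens.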
Proceedings of the 8th IEEE International Conference on E-Commerce Technology and the 3rd IEEE
International Conference on Enterprise Computing, E-Commerce, and E-Services (CEC/EEE’06)
0-7695-2511-3/06 $20.00 © 2006 IEEE