I.-Y. Song et al. (Eds.): ER 2003, LNCS 2813, pp. 476–489, 2003.
© Springer-Verlag Berlin Heidelberg 2003
A Heuristic-Based Methodology for Semantic
Augmentation of User Queries on the Web*
Andrew Burton-Jones¹, Veda C. Storey¹, Vijayan Sugumaran², and Sandeep Purao³
¹ J. Mack Robinson College of Business, Georgia State University,
Atlanta, GA 30302
{vstorey,abjones}@gsu.edu
² School of Business Administration, Oakland University,
Rochester, MI 48309
sugumara@oakland.edu
³ School of Information Sciences & Technology, The Pennsylvania State University,
University Park, PA 16801-3857
spurao@ist.psu.edu
Abstract. As the World Wide Web continues to grow, so does the need for
effective approaches to processing users’ queries that retrieve the most relevant
information. Most search engines return many web pages, but with
varying levels of relevance. The Semantic Web has been proposed to retrieve
and use more semantic information from the web. However, the capture and
processing of semantic information is a difficult task because of the well-known
problems that machines have with processing semantics. This research proposes
a heuristic-based methodology for building context-aware web queries. The
methodology expands a user’s query to identify possible word senses and then
makes the query more relevant by restricting it using relevant information from
the WordNet lexicon and the DARPA DAML library of domain ontologies. The
methodology is implemented in a prototype. Initial testing of the prototype and
comparison to results obtained from Google show that this heuristic-based
approach to processing queries can provide more relevant results to users,
especially when query terms are ambiguous and/or when the methodology’s
heuristics are invoked.
1 Introduction
It is increasingly difficult to retrieve relevant web pages for queries from the World
Wide Web due to its rapid growth and lack of structure [31, 32]. In response, the
Semantic Web has been proposed to extend the WWW by giving information well-
defined meaning [3, 10]. The Semantic Web relies heavily on ontologies to provide
taxonomies of domain specific terms and inference rules that serve as surrogates for
semantics [3]. Berners-Lee et al. describe the Semantic Web as “not a new web but an
extension of the current one, in which information is given well-defined meaning”
[3]. Unfortunately, it is difficult to capture and represent meaning in machine-
* This research was partially supported by J. Mack Robinson College of Business, Georgia
State University and Office of Research & Graduate Study, Oakland University.