Ontology-based Integration and Retrieval over Multiple Quantities
— What if “Ovate leaves and often blue to purple flowers”
Shenghui Wang
Department of Computer Science
Vrije Universiteit Amsterdam
The Netherlands
Jeff Z. Pan
Department of Computing Science
University of Aberdeen
United Kingdom
Abstract
Information integration and retrieval have been important
problems for many information systems: it is hard to combine
multidimensional and parallel information and make it
available for application queries. In our previous work [12],
we have shown how to use ontologies to facilitate integrating
and querying parallel but single-dimensional information. In
this paper, we further investigate how to take advantage of
ontologies to facilitate integrating parallel information and
querying over multiple quantities.
1 Introduction
Information integration and retrieval have been important
problems for many information systems, including those based
on the Web [9]: it is hard to combine information from
different sources and make it available for application
queries. In this paper, we focus on descriptive domains, where
most information is available in natural language (NL) form
and is parallel, i.e., the same objects or phenomena are
described in multiple free-styled documents [3]. To some
extent, the Web itself is a huge source of parallel
descriptions. It has been argued in [13] that NLs are not
adept at describing continuous quantities precisely.
Therefore, automated information processing in descriptive
domains suffers from the lack of techniques to capture the
semantics of natural language descriptions precisely and
represent them properly.
Recently, the W3C standardised the OWL Web Ontology Language
[1] in its Semantic Web Activity. With ontologies being shared
understandings of application domains, ontology-based
integration and retrieval [10] is a promising direction. In
our previous work [12], we have shown how to use ontologies to
facilitate integrating and querying parallel but
single-quantity information (shape descriptions). More
specifically, parallel shape descriptions can be extracted and
represented in a uniform ontology; the explicitly written
information can then be accessed easily, and the implicit
knowledge can be deduced naturally by applying reasoning on
the whole ontology. In this paper, we further investigate how
to take advantage of ontologies to facilitate integrating
parallel information and querying over multiple quantities. As
in [12], we choose botany as our application domain, as it is
one of the premier descriptive sciences and offers a wealth of
material on which to evaluate our approach. In particular, we
consider parallel colour and leaf shape descriptions in our
ontology, which is an extension of the ones that we used in
[11, 12, 13].
For example, the colour of flowers of species Paeonia
anomala is described in two floras:
• purple-pink — in Ornamental Plants From Russia,
• rose to red, occasionally nearly white — in Flora of China;
while its leaf shape is also described differently as
• lanceolate — in Ornamental Plants From Russia,
• linear to linear-lanceolate — in Flora of China.
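As an illustration only, the parallel descriptions above can be held in a small record structure before any ontology modelling takes place. The sketch below is not the representation used in this paper: the `Description` class, its field names, and the `group_parallel` helper are hypothetical names introduced here, and the only data are the four descriptions quoted above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Description:
    """One free-text description of one quantity (hypothetical record type)."""
    species: str
    quantity: str  # e.g. "flower colour" or "leaf shape"
    source: str    # the flora the description comes from
    text: str      # the description, quoted verbatim


# The four descriptions of Paeonia anomala quoted in the text.
DESCRIPTIONS = [
    Description("Paeonia anomala", "flower colour",
                "Ornamental Plants From Russia", "purple-pink"),
    Description("Paeonia anomala", "flower colour",
                "Flora of China", "rose to red, occasionally nearly white"),
    Description("Paeonia anomala", "leaf shape",
                "Ornamental Plants From Russia", "lanceolate"),
    Description("Paeonia anomala", "leaf shape",
                "Flora of China", "linear to linear-lanceolate"),
]


def group_parallel(descriptions):
    """Group descriptions by (species, quantity).

    Each group maps a source to its text, so the parallel descriptions
    of the same quantity sit side by side.
    """
    grouped = defaultdict(dict)
    for d in descriptions:
        grouped[(d.species, d.quantity)][d.source] = d.text
    return dict(grouped)


parallel = group_parallel(DESCRIPTIONS)
for (species, quantity), by_source in parallel.items():
    print(f"{species}, {quantity}:")
    for source, text in by_source.items():
        print(f"  {source}: {text}")
```

Keying on the pair (species, quantity) keeps each quantity as a separate dimension while still allowing all quantities of one species to be looked up together.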
Being able to handle each quantity as a separate dimension is
simply the first step. With multiple quantities in our
ontology, we can ask many interesting questions. For example,
a user may ask the plant knowledge base: which species
definitely have “linear” leaves and more or less
“bluish-purple” flowers, blooming in early spring across the
British Isles? The English bluebell satisfies this query, but
are there any other species having similar morphological
features?
The contributions of this paper include solutions to the
following issues related to multidimensional integration and
querying:
1. We focus on the semantics of natural language descriptions
with frequency information, such as “sometimes,” “rarely,”
etc., and its representation in an ontology system.
2007 IEEE/WIC/ACM International Conference on Web Intelligence
0-7695-3026-5/07 $25.00 © 2007 IEEE
DOI 10.1109/WI.2007.63
388