Sundarapandian et al. (Eds) : ACITY, AIAA, CNSA, DPPR, NeCoM, WeST, DMS, P2PTM, VLSI - 2013
pp. 147–154, 2013. © CS & IT-CSCP 2013 DOI : 10.5121/csit.2013.3415
Ahmed Elsayed¹, Ahmed Sharaf Eldin¹, Doaa S. El Zanfaly¹˒²
¹ Information Systems Department, Faculty of Computers and Information,
Helwan University, Cairo, Egypt
eng_ahmedyakoup@yahoo.com, profase2000@yahoo.com,
doasad71@yahoo.com
² Informatics and Computer Science, British University in Egypt, Cairo, Egypt
doaa.elzanfaly@bue.edu.eg
ABSTRACT
Keyword Search Over Relational Databases (KSORDB) provides an easy way for casual users
to access relational databases using a set of keywords. Although much research has been done
and several prototypes have been developed recently, most of this work implements only exact
(also called syntactic or keyword) matching. Consequently, when there is a vocabulary mismatch,
the user gets no answer even though the database may contain relevant data. In this paper we
propose a system that overcomes this issue. Our system extends existing schema-free KSORDB
systems with semantic matching features: if a query returns no or very few answers, the system
exploits a domain ontology to progressively return related terms that can be used to retrieve
more relevant answers for the user.
1. INTRODUCTION
A significant amount of plain text and structured data has been stored side by side in relational
databases for decades. In order to query this data, users have to know the database schema and
then use Structured Query Language (SQL) to issue a precise, unambiguous and well-formed
query. To relieve users of this burden, Keyword Search over Relational Databases (KSORDB)
enables casual users to query this data using a set of keywords called a keyword query. Existing
KSORDB approaches fall into two main categories: schema-based approaches
[1-3] and schema-free approaches [4-7]. To process a keyword query, the schema-based
approach uses the schema graph to generate Candidate Networks (CNs) and then evaluates these
networks using SQL queries. In the schema-free approach, by contrast, the database is modeled
as a data graph in which nodes are tuples and edges are foreign key-primary key relationships. The
graph is then traversed at query time to answer keyword queries.
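
The schema-free model described above can be illustrated with a minimal sketch. It is not the implementation of any of the cited systems: the toy tables (author, paper, writes), the helper names build_data_graph, matching_nodes and connect, and the choice of breadth-first search as the traversal are all assumptions made for illustration. Tuples are nodes, foreign key-primary key references are undirected edges, and a two-keyword query is answered by the shortest path joining a tuple that matches each keyword.

```python
from collections import deque

# Hypothetical toy database: each node is a tuple identified by (table, primary key).
tuples = {
    ("author", 1): "Alice Smith",
    ("author", 2): "Bob Jones",
    ("paper", 10): "Keyword search over graphs",
    ("paper", 11): "Ontology matching survey",
    ("writes", 100): "",  # link tuples carry no text of their own
    ("writes", 101): "",
    ("writes", 102): "",
}

# Foreign key-primary key references: (referencing tuple, referenced tuple).
references = [
    (("writes", 100), ("author", 1)), (("writes", 100), ("paper", 10)),
    (("writes", 101), ("author", 2)), (("writes", 101), ("paper", 10)),
    (("writes", 102), ("author", 2)), (("writes", 102), ("paper", 11)),
]


def build_data_graph(references):
    """Build an undirected adjacency map from foreign key-primary key references."""
    graph = {}
    for src, dst in references:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set()).add(src)
    return graph


def matching_nodes(tuples, keyword):
    """Exact (syntactic) match: the keyword must equal a token of the tuple text."""
    kw = keyword.lower()
    return {node for node, text in tuples.items() if kw in text.lower().split()}


def connect(tuples, graph, kw1, kw2):
    """Breadth-first search from tuples matching kw1 to any tuple matching kw2."""
    sources = matching_nodes(tuples, kw1)
    targets = matching_nodes(tuples, kw2)
    queue = deque((s, [s]) for s in sorted(sources))
    seen = set(sources)
    while queue:
        node, path = queue.popleft()
        if node in targets:
            return path
        for nxt in sorted(graph.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None  # no connecting path, or one keyword matched nothing


graph = build_data_graph(references)
print(connect(tuples, graph, "smith", "graphs"))
print(connect(tuples, graph, "smith", "article"))
```

In this sketch the second query returns None because no tuple contains the token "article", even though the paper tuples are plainly relevant. This is the vocabulary mismatch that exact matching cannot resolve.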
The main difficulty with schema-based KSORDB systems lies in generating optimal SQL
queries from a huge number of CNs. Moreover, the generated queries usually contain many join