Journal of Database Management, 20(4), 1-25, October-December 2009
Copyright © 2009, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global
is prohibited.
Keywords: Cost Model, Query-Mapping, Query Processing, Relational Databases, Top-K Query, Tradeoff
Analysis, Uncertainty Modeling
INTRODUCTION
Relational databases are increasingly being
used to support a wide range of interactive
applications that require efficient methods for
exploratory search and retrieval (for example,
searches for airline tickets, hotel rooms, real
estate, or used cars). In such applications, users
specify target values for certain attributes
without necessarily requiring exact matches
to these values in return. However, relational
queries normally impose rigid qualifications and
return only data that exactly match the selection
conditions (Motro, 1988). Due to the exact
nature of relational databases and the query
A Cost-Based Range
Estimation for Mapping
Top-k Selection Queries
over Relational Databases
Anteneh Ayanso, Brock University, Canada
Paulo B. Goes, University of Arizona, USA
Kumar Mehta, George Mason University, USA
ABSTRACT
Finding efficient methods for supporting top-k relational queries has received significant attention in academic
research. One of the approaches in the recent literature is query-mapping, in which top-k queries are mapped
(translated) into equivalent range queries that relational database systems (RDBMSs) normally support. This
approach combines the advantage of simplicity as well as practicality by avoiding the need for modifications
to the query engine, or specialized data structures or indexing techniques to handle top-k queries separately.
However, existing methods following this approach fall short of adequately modeling the problem environment
and providing consistent results. In this article, the authors propose a cost-based range estimation model for
the query-mapping approach. They provide a methodology for trading off relevant query execution cost
components and mapping a top-k query into a cost-optimal range query for efficient execution. Their experiments
on real-world and synthetic data sets show that the proposed strategy not only avoids the need to calibrate
workloads on specific database contents, but also performs at least as well as prior methods.
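To make the query-mapping idea concrete, the following is a minimal sketch of how a top-k selection on a single numeric attribute could be answered with ordinary range queries. It is not the authors' cost-based model: the `cars` table, the helper names `build_demo_db` and `top_k_by_range`, the initial `radius`, and the doubling restart policy are all assumptions made for illustration only.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory table (hypothetical data, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cars (id INTEGER PRIMARY KEY, price INTEGER)")
    conn.executemany(
        "INSERT INTO cars (id, price) VALUES (?, ?)",
        [(1, 9000), (2, 10500), (3, 12000), (4, 15000), (5, 20000)],
    )
    return conn


def top_k_by_range(conn, target_price, k, radius, growth=2, max_rounds=20):
    """Answer a top-k query using only range (BETWEEN) queries.

    Assumption: `radius` is a guessed search distance and the range is
    widened by `growth` on each restart. The article instead estimates a
    cost-optimal range; that estimation is not reproduced here.
    """

    def distance(row):
        # Rank by closeness to the target value; break ties by id.
        return (abs(row[1] - target_price), row[0])

    for _ in range(max_rounds):
        rows = conn.execute(
            "SELECT id, price FROM cars WHERE price BETWEEN ? AND ?",
            (target_price - radius, target_price + radius),
        ).fetchall()
        if len(rows) >= k:
            # Every tuple outside the range is farther from the target than
            # every tuple inside it, so the true top-k lie among `rows`.
            return sorted(rows, key=distance)[:k]
        radius *= growth  # too few tuples retrieved: restart with a wider range

    # Fallback: scan the whole table if the range never captured k tuples.
    rows = conn.execute("SELECT id, price FROM cars").fetchall()
    return sorted(rows, key=distance)[:k]


if __name__ == "__main__":
    demo = build_demo_db()
    print(top_k_by_range(demo, target_price=11000, k=2, radius=400))
```

In this sketch, a range that is too narrow forces restarts, while a range that is too wide retrieves and ranks more tuples than necessary. Balancing costs of this kind is plausibly what a range estimation model has to address, although the specific cost components the article considers are not shown here.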
DOI: 10.4018/jdm.2009062501
This paper appears in the publication, Journal of Database Management, Volume 20, Issue 4
edited by Keng Siau © 2009, IGI Global