DOI: 10.4018/IJBAN.2017010103
Copyright © 2017, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
International Journal of Business Analytics
Volume 4 • Issue 1 • January-March 2017
ABSTRACT
View selection deals with the selection of appropriate sets of views capable of improving the response
times for queries while conforming to space constraints. Materializing all views is infeasible, as the
number of possible views is exponential with respect to the number of dimensions and, thus, would
not fit within the available storage space. Further, optimal view selection is an NP-Complete problem.
Thus, the only remaining alternative is to select a subset of views that reduce the query response time
and fit within the available space for materialization. The most fundamental greedy view selection
algorithm HRUA considers the size parameter for computing the Top-K views for materialization. In
each iteration, it computes the benefit, with respect to size, of all non-selected views, and selects the
one entailing the highest benefit for materialization. Though these selected views may be beneficial
with respect to their size, they may not be capable of answering large numbers of future queries, thereby
becoming an unnecessary space overhead. Existing query frequency based view selection algorithms,
which address this problem, have been compared in this paper. Experimental results show that each
of these algorithms, in comparison to HRUA, is able to select fairly good quality views that provide
answers to comparatively greater numbers of queries. Materializing these selected views would
facilitate the business decision making process.
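
The size-based greedy selection described above can be illustrated with a minimal sketch. It assumes the commonly described form of the HRU-style benefit: each view is identified by its set of dimensions, a view can answer any view whose dimensions are a subset of its own, and the benefit of a candidate is the total reduction in the number of rows that must be scanned. The function name `hru_greedy`, the three-dimension lattice, and all view sizes are hypothetical and serve only as an illustration; they are not taken from the paper.

```python
def hru_greedy(sizes, root, k):
    """Select up to k views greedily by size-based benefit (HRU-style sketch).

    sizes: dict mapping each view (a frozenset of dimensions) to its row count.
    root:  the base view, assumed to be already materialized.
    """
    # cost[w] = size of the smallest materialized view that can answer w
    cost = {w: sizes[root] for w in sizes}
    selected = []
    for _ in range(k):
        best, best_benefit = None, 0
        for v in sizes:
            if v == root or v in selected:
                continue
            # v can answer every view w whose dimensions are a subset of v's
            benefit = sum(max(cost[w] - sizes[v], 0) for w in sizes if w <= v)
            if benefit > best_benefit:
                best, best_benefit = v, benefit
        if best is None:  # no remaining view yields a positive benefit
            break
        selected.append(best)
        # Views answerable from the new selection may now be cheaper to compute
        for w in sizes:
            if w <= best:
                cost[w] = min(cost[w], sizes[best])
    return selected


# Hypothetical lattice over dimensions A, B, C with made-up sizes (in rows)
views = ["ABC", "AB", "AC", "BC", "A", "B", "C", ""]
sizes = dict(zip(map(frozenset, views), [100, 80, 60, 30, 20, 10, 5, 1]))
top2 = hru_greedy(sizes, frozenset("ABC"), 2)
print(["".join(sorted(v)) for v in top2])  # ['BC', 'A']
```

Note that the benefit here depends only on view sizes. No query frequency enters the computation, which is the limitation the compared algorithms address.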
KEYWORDS
Analytical Queries, Decision Making, Greedy Algorithms, Materialized View Selection, Warehouse
Query Frequency based View Selection
Mohammad Haider, Saudi Electronic University, Dammam, Saudi Arabia
T.V. Vijay Kumar, Jawaharlal Nehru University, New Delhi, India
INTRODUCTION
Globalization of businesses has led to voluminous amounts of data being generated continuously over
time. In this age of ever-changing data and a wants-driven economy, readily available and updated
information plays a vital role in the formulation of optimal business strategies for gaining competitive
advantage. To be, and remain, competitive in today’s volatile market, considerable effort is required,
such as conducting market research to identify customer demands as opposed to their needs. Exponential
growth in the areas of information technology and information processing has been observed in
the last few decades. Proper and timely availability of this processed information holds the key for
businesses to survive. In order to meet this demand for information, the capture and efficient storage
of the turbulent data that is to be processed for analysis should be the major focus.
Such processed data generally proves useful for knowledge workers and/or decision makers in the
decision making process. Availability of such data shall provide the business houses a substantive
edge over their competitors.
With the advent of the era of technological enhancement in areas of software, analytics, hardware
capabilities and data communication, most organizations have collected massive amounts of raw data.
As a result, although most such organizations are data rich, they are lacking in cogent information (Gray
& Watson, 1998; Han & Kamber, 2000) leading to valuable information getting lost inside humongous