Copyright © 2017, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
Chapter 1
DOI: 10.4018/978-1-5225-1008-6.ch001
ABSTRACT
This chapter presents a fuzzy-based approach to assess the standards and quality of big data. It
also serves as a platform for organizations that intend to migrate their existing database environment to
a big data environment. Data is assessed using a multidimensional approach based on quality factors such as
accuracy, completeness, reliability, and usability. These factors are analysed by constructing decision
trees to identify the quality aspects that need to be improved. In this work, fuzzy queries have been
designed and grouped into four sets, namely Excellent, Optimal, Fair, and Hybrid. A query set is chosen
based on the fuzzy data sets formed and the query compatibility index. A data set that has a
very high degree of membership is assigned the Fair query set. A data set with a medium degree of
membership is assigned the Optimal query set. A data set that has a low degree of membership is assigned the
Excellent query set. A data set that needs a combination of queries from all of the above is assigned the Hybrid
query set. The fuzzy query based approach reduces the query compatibility index by 36% compared to
a normal query set approach.
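The assignment rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract gives no numeric cut-offs, so the thresholds `HIGH_MEMBERSHIP` and `MEDIUM_MEMBERSHIP`, the function name `assign_query_set`, and the `needs_combination` flag are all hypothetical and are not taken from the chapter.

```python
from enum import Enum


class QuerySet(Enum):
    EXCELLENT = "Excellent"
    OPTIMAL = "Optimal"
    FAIR = "Fair"
    HYBRID = "Hybrid"


# Hypothetical cut-offs: the chapter does not state numeric thresholds.
HIGH_MEMBERSHIP = 0.7
MEDIUM_MEMBERSHIP = 0.4


def assign_query_set(membership: float, needs_combination: bool = False) -> QuerySet:
    """Map a data set's degree of membership (0..1) to a query set.

    Follows the rule stated in the abstract: very high membership -> Fair,
    medium -> Optimal, low -> Excellent, mixed needs -> Hybrid.
    """
    if not 0.0 <= membership <= 1.0:
        raise ValueError("membership must lie in [0, 1]")
    if needs_combination:
        # Assumption: the caller decides when a mix of query sets is needed.
        return QuerySet.HYBRID
    if membership >= HIGH_MEMBERSHIP:
        return QuerySet.FAIR
    if membership >= MEDIUM_MEMBERSHIP:
        return QuerySet.OPTIMAL
    return QuerySet.EXCELLENT
```

For example, under these assumed thresholds `assign_query_set(0.9)` yields the Fair set and `assign_query_set(0.2)` yields the Excellent set, mirroring the inverse relation between membership and query-set strength stated above.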
Fuzzy-Based Querying Approach for Multidimensional Big Data Quality Assessment
Pradheep Kumar K.
BITS Pilani, India
Venkata Subramanian D.
Hindustan Institute of Technology & Science, India
INTRODUCTION
With the growing volume of data processing and information requirements, it is
essential to develop strategies to manage data effectively and to assess it against essential quality checks. The
database forms the basis of the day-to-day decisions taken by an organization. Data obtained from employees
needs to be updated periodically for effective utilization. In this work, an attempt has been made to assess
data quality based on measures such as Accuracy, Completeness, Reliability, and Usability,
as discussed by Pradheep et al. (2014). Based on these parameters the data set is queried to assess