2018 4th International Conference for Convergence in Technology (I2CT)
978-1-5386-5232-9/18/$31.00 ©2018 IEEE
Defining Fuzzy Membership Functions for Fuzzy
Data Warehouses
PPG Dinesh Asanka
Department of Computer Science and
Engineering
University of Moratuwa
Moratuwa, Sri Lanka
dineshasanka@gmail.com
Amal Shehan Perera
Department of Computer Science and
Engineering
University of Moratuwa
Moratuwa, Sri Lanka
shehan@cse.mrt.ac.lk
Abstract— Data warehouses are used in many organizations as a
tool to gain a competitive advantage over competitors.
However, most data warehouse implementations rely on many
assumptions. Because of these assumptions, the outcome of the
data warehouse can be some distance from the truth, and the
veracity of the data warehouse becomes questionable. To address
this veracity problem, fuzzy logic can be incorporated. In fuzzy
logic, the fuzzy membership function plays a central role; hence,
in a fuzzy data warehouse, fuzzy membership plays a key role as
well. Different types of fuzzy membership functions can be
introduced in a data warehouse. This research paper introduces
arbitrary, data-driven, linguistic, derived, and survey-based
membership functions for different cases in a data warehouse.
Keywords— Data Warehousing, Fuzzy, Membership Function
I. INTRODUCTION
The data warehouse has become an important strategic
component of information systems in fiercely competitive
industries. Data warehousing has now spread to various sectors
such as agriculture [1] [2] [3] [4] [5] [6] [7] [8] [9] [10],
Customer Relationship Management (CRM) [11], banking [12]
[13], and healthcare. Although data warehousing is not
considered an emerging technology, industry has only recently
adopted it into business operations, due to various issues.
Over the years, data warehouse analytics has been limited to
crisp value analysis. Although a few attempts have been made to
introduce the veracity aspect of data into data warehousing, it
is important to emphasize that the end-to-end aspects of data
warehousing have not been fully considered, due to various
technical and practical limitations.
However, attempts have been made to introduce veracity
aspects in many sectors by using fuzzy theory. Fuzzy logic
has been proposed to mitigate uncertainty in many domains such
as agriculture [14], medicine [15] [16], power systems [17],
production [18], sports [19], and transportation [20].
In the field of data warehousing, crisp values are used for
analytics. For example, suppose there is a need to analyze a
measure (say, sales) against the age of the customer. Customer
age can be configured as nominal values such as low, middle and
high. Depending on the domain and the situation, the ranges for
low, middle and high will differ. When nominal labels are used,
every value in a range is treated identically in the analysis,
which is clearly not accurate. To introduce veracity, fuzzy
logic can be used. For example, 30 years of age can be assigned
weightages of 0.3 and 0.7 for medium and low respectively,
whereas in crisp set analysis, 30 years of age will be labeled
as Medium and only Medium. In the crisp case, the young or old
contribution of the age is ignored.
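The age example above can be sketched in code. The following is a minimal illustration only, not the method proposed in this paper: the function names, the breakpoints (24, 44, 64) and the crisp cut points (30, 55) are all hypothetical, with the breakpoints chosen so that an age of 30 receives weightages of 0.7 for low and 0.3 for medium.

```python
# Hypothetical sketch: piecewise-linear membership functions for customer age.
# All breakpoints below are assumptions made for illustration only.

def low(age, full=24.0, zero=44.0):
    """Left shoulder: 1 up to `full`, falling linearly to 0 at `zero`."""
    if age <= full:
        return 1.0
    if age >= zero:
        return 0.0
    return (zero - age) / (zero - full)


def medium(age, start=24.0, peak=44.0, end=64.0):
    """Triangle: 0 at `start`, 1 at `peak`, back to 0 at `end`."""
    if age <= start or age >= end:
        return 0.0
    if age <= peak:
        return (age - start) / (peak - start)
    return (end - age) / (end - peak)


def high(age, zero=44.0, full=64.0):
    """Right shoulder: 0 up to `zero`, rising linearly to 1 at `full`."""
    if age <= zero:
        return 0.0
    if age >= full:
        return 1.0
    return (age - zero) / (full - zero)


def fuzzify_age(age):
    """Return the degree of membership of `age` in each fuzzy label."""
    return {"low": low(age), "medium": medium(age), "high": high(age)}


def crisp_label(age, medium_from=30, high_from=55):
    """Crisp alternative: exactly one label per age (hypothetical ranges)."""
    if age < medium_from:
        return "low"
    if age < high_from:
        return "medium"
    return "high"


if __name__ == "__main__":
    print(fuzzify_age(30))   # fuzzy view: partial membership in low and medium
    print(crisp_label(30))   # crisp view: a single label
```

Under these assumed breakpoints, the fuzzy view keeps both the low and the medium contribution of a 30-year-old customer, while the crisp view keeps only one label and discards the rest.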
In this research paper, design strategies for the data warehouse
are introduced by looking at various aspects of data warehouses,
such as dimensions and fact tables, and at different scenarios.
The current research status of the fuzzy data warehouse is
discussed in the section State of Art – Fuzzy Data Warehouse.
The methodology is discussed in the following section, while the
configuration of fuzzy membership is discussed in the section
after that. In the next two sections, fuzzy data warehouse design
is discussed in detail. Finally, the conclusion and future work
are discussed.
II. STATE OF ART – FUZZY DATA WAREHOUSE
As discussed in the introduction, many domains have reported
the use of data warehouses. Since most of these research papers
were published in recent years, it can be concluded that data
warehousing is still a popular and widely used technology in
industry today. Fuzzy logic is also not novel to industry, as a
few fuzzy logic implementations have been made, as discussed in
the previous section. In this section, the current research
status is identified with respect to all relevant areas. The
literature review is divided into two main areas: fuzzy
databases and fuzzy data warehouse design.
Since many techniques of fuzzy databases can be utilized
in the data warehouse, it was decided to analyze the research
area of fuzzy databases. Unlike fuzzy data warehousing, a few
attempts have been made to design fuzzy databases for
relational and transactional databases. The research paper
titled A Fuzzy Representation of Data for Relational Databases
[21] suggests that its relational algebra operations consist of
the same four parts as traditional relational algebra operations.
To prove the concept, that paper presents an implementation for
the selection of a baseball team. In the case of baseball team
selection,
there are expert places for each place. Some players are better