Association Rule and Decision Tree based Methods
for Fuzzy Rule Base Generation
Ferenc Peter Pach and Janos Abonyi
Pannon University, Department of Process Engineering,
Veszprem, P.O. Box 158, H-8201, Hungary,
http://www.fmt.vein.hu/softcomp
abonyij@fmt.vein.hu
Abstract— This paper focuses on the data-driven generation
of fuzzy IF...THEN rules. The resulting fuzzy rule base can be
used to build a classifier or a prediction model, or
to form a decision support system. Among
the wide range of possible approaches, the decision tree and
the association rule based algorithms are overviewed, and two
new approaches are presented based on the a priori fuzzy
clustering based partitioning of the continuous input variables.
An application study is also presented, where the developed
methods are tested on the well known Wisconsin Breast Cancer
classification problem.
I. INTRODUCTION
Human logic can be represented well by logical expressions
in the syntax of rules, with an antecedent and a consequent part.
A short example: if somebody has forgotten her/his
umbrella at home and it is pouring with rain, then the chances
are that she/he will get soaked. A set of logical rules is
called a rule base, which is an easy and useful representation of the
knowledge of a given area. “Various types of logical rules can
be discussed in the context of the decision borders these rules
create in multidimensional feature space. The standard crisp
propositional IF...THEN rules provide overlapping hyperrectangular
covering areas, threshold logic rules are equivalent to
separating hyperplanes, while fuzzy rules based on real-valued
predicate functions” (quoted from the prologue to [52]).
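The difference between a crisp and a fuzzy antecedent can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the triangular membership shape, the minimum operator used for AND, and the function names crisp_rule, triangular and fuzzy_rule are hypothetical choices made for this example and are not taken from the methods presented later in the paper.

```python
def crisp_rule(x, low, high):
    """Crisp antecedent: fires fully inside [low, high], not at all outside."""
    return 1.0 if low <= x <= high else 0.0


def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b
    (an assumed shape, chosen only for illustration)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzy_rule(inputs, fuzzy_sets):
    """Firing degree of a rule 'IF x1 is A1 AND x2 is A2 ...',
    using the minimum as the AND operator (an assumption)."""
    return min(triangular(x, *params) for x, params in zip(inputs, fuzzy_sets))


# A crisp rule switches abruptly at the interval border ...
print(crisp_rule(9.9, 0, 10), crisp_rule(10.1, 0, 10))
# ... while a fuzzy rule fires to a degree between 0 and 1.
print(fuzzy_rule([2.5, 5.0], [(0, 5, 10), (0, 5, 10)]))
```

The crisp rule defines one side of a hyperrectangle with a hard border, whereas the fuzzy rule degrades gradually as the inputs move away from the peaks of the membership functions.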
Accordingly, many rule-based methods have been developed
for extracting knowledge from databases. The paper [40]
introduces a genetic programming (GP) and fuzzy logic based
algorithm that extracts explanatory rules from microarray
data. A hybrid approach is proposed in [7], where a standard
GP and a heuristic hierarchical crisp rule-base construction
are combined. A fuzzy mining algorithm based on Srikant and
Agrawal's method [48] is proposed for extracting generalized
rules with the use of taxonomies [51]. In [34], compact fuzzy
rule extraction is based on adaptive data approximation using
B-splines.
Rule bases are used efficiently in many areas, but this paper
concentrates primarily on prediction applications. Rule
bases have been applied successfully, for example, in stock exchange
estimation [37], weather forecasting [32], and future sales forecasting [19].
High prediction accuracy of the applied model (built
from the extracted rules) is very important, but
understanding the model can also be critical in many areas. It
is very useful to know what lies behind the
decisions, since the rules can then be edited or changed by
specialists of the application area. Compact and comprehensible
predictive models, together with their visualization possibilities,
can support better human decisions. The paper [52] presents many
computational intelligence techniques (based on decision trees,
neural networks, etc.) that are very useful tools for rule extraction
and data understanding.
In the development of new rule-based methods for prediction
applications, besides retaining and improving the
achieved accuracy (in classification problems), one
of the most important objectives is to increase the interpretability
of the rules. One possible way to address this aspect
is to adopt fuzzy logic.
Besides representing the discovered rules in a form that is
far more natural for humans, fuzzy logic yields more robust
predictive models (classifiers) in the case of false, inconsistent,
and missing data.
In this paper, a fuzzy decision tree based method (Section II-B) and
a fuzzy association rule based method (Section III-B) are
introduced for fuzzy rule base generation. Our main goal
is to show how to construct compact fuzzy rule bases that
can be used for data analysis, classification, or prediction.
Therefore, prediction accuracy (for classification problems) and
interpretability are both in focus during the rule extraction
steps of both algorithms. The classification effectiveness of the
proposed methods is tested on the Wisconsin Breast Cancer
problem. The results are summarized in a short application
study (Section IV).
II. FUZZY DECISION TREE BASED METHODS
A. Existing decision tree induction algorithms
Decision tree based methods are widely used in data mining
and decision support applications. A decision tree is fast and
easy to use for rule generation and classification problems;
moreover, it is an excellent tool for representing decisions.
The popularity and spread of decision trees are based on
the ID3 algorithm by Quinlan [46]. Many studies have been
written on the induction and analysis of decision trees [54], [47],
[35], [36], [55]. The application areas of decision trees are
also very broad [6], [45], [15], [50], [49], [38].
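For orientation, the attribute selection criterion commonly associated with ID3, information gain, can be sketched as follows. This is a minimal sketch for categorical attributes only; the function names entropy and information_gain and the toy data are illustrative assumptions and do not reproduce any implementation from the cited works.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting on a categorical attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted average entropy of the subsets created by the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data (hypothetical): the first attribute separates the classes
# perfectly, the second carries no information about them.
labels = ["a", "a", "b", "b"]
print(information_gain(["x", "x", "y", "y"], labels))
print(information_gain(["x", "y", "x", "y"], labels))
```

At each node, the attribute with the highest information gain is chosen for the split, and the procedure is repeated recursively on the resulting subsets.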
PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY VOLUME 13 MAY 2006 ISSN 1307-6884