IEEE Intelligent Systems, © 2007 IEEE. Published by the IEEE Computer Society.

Knowledge Acquisition

A Heuristic Approach to Learning Rules from Fuzzy Databases

José Ranilla and Luis J. Rodríguez-Muñiz, University of Oviedo

Acquiring concepts from examples and performing data-mining tasks are central problems in artificial intelligence. Ross Quinlan identifies five formalisms for approaching the problem: decision trees and rule-production systems, instance-based classifiers, neural networks, genetic algorithms, and statistics.[1] Well-known systems that represent learned knowledge as decision trees or rule sets include ID3,[2] Prism,[3] C4.5,[1] and the AQxx family.[4,5] Other systems move across the boundaries of these formalisms; for instance, Ripper[6] and Fan[7] produce learning rules in instance-based environments.

Noise, missing values, or data inconsistencies complicate the concept-acquisition problem. C4.5 and AQ18[4] deal efficiently with such data sets, but ID3 and Prism can cope only with consistent, noise-free data. Nevertheless, fuzzy sets and fuzzy logic can overcome the difficulties commonly reported in applying these classification methods to vague and ambiguous domains (see the "Related fuzzy set and classification concepts" sidebar). The research literature reports a considerable number of ID3-based systems[8-10] and a fuzzy version of Prism.[11] ID3's fuzzy descendants share an entropy-based approach to selecting the most relevant test when building a decision tree, whereas information gain guides the fuzzy version of Prism to produce rules.

As an alternative to approaches based on entropy and information gain, we describe a system that uses a measure called the impurity level.[12] The learning algorithm based on this measure, which we call FARNI, first induces fuzzy decision trees by using an impurity-level extension for selecting the best branch.
This is similar to the way C4.5 and ARNI[13] induce selections for crisp databases. Once FARNI calculates the fuzzy decision tree, it returns compact fuzzy rule sets by applying a pruning process inherited from Fan.[7]

A classification measure used successfully with crisp data is extended to deal with cognitive uncertainties in the learning task. Its implementation in an algorithm outperforms similar algorithms in experimental tests.

The impurity level

A common difficulty in any nontrivial learning problem is selecting the classifier that best covers unseen cases. Using a crisp classifier's absolute or relative number of correct classifications on the learning set is inadequate because it doesn't account for the number of times the learning algorithm has used the classifier. This approach is also less accurate when noise or data inconsistencies are present. So, a classifier with some classification failures on the learning set isn't necessarily worse than one with no failures.

The impurity-level measurement considers all these aspects of a learning problem. Originally devised as a way to estimate the quality of classification rules,[12] the impurity level is based on the IB3 learning algorithm's mechanism for selecting a set of representative instances from a set of training examples.[14] It was first implemented in Fan and later in systems such as ARNI and INNER.[15]

To compute a crisp rule's impurity level, we first calculate the confidence interval of its success probability as follows:
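Before the formula, a toy numeric sketch may help show why raw learning-set accuracy misleads. The sketch below assumes the Wilson score lower bound on the success probability, the interval IB3 is usually described as using; the article's exact expression follows in the text and may differ in detail. The point it illustrates is the one above: a rule with a few failures over many firings can be a safer bet than a flawless rule that fired only a handful of times.

```python
import math

def wilson_lower(successes: int, trials: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score confidence interval for a rule's
    success probability (illustrative assumption modeled on IB3's
    acceptance test; not necessarily the article's exact formula)."""
    if trials == 0:
        return 0.0
    p = successes / trials
    denom = 1 + z * z / trials
    centre = p + z * z / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials))
    return (centre - margin) / denom

# Rule A: correct 45 of the 50 times it fired (90% accuracy, well tested).
# Rule B: correct 3 of 3 times (100% accuracy, barely tested).
# The lower bound ranks the imperfect but well-tested rule A higher.
print(wilson_lower(45, 50))  # ~0.79
print(wilson_lower(3, 3))    # ~0.44
```

A quality measure built on such a bound therefore penalizes rules the learner has rarely exercised, which is exactly the behavior that plain accuracy on the learning set fails to provide.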