Sugeno fuzzy integral for finding fuzzy if–then classification rules

Yi-Chung Hu

Department of Business Administration, Chung Yuan Christian University, 200, Chung Pei Road, Chung-Li 320, Taiwan, ROC

Abstract

Data mining techniques can be used to discover useful information by exploring and analyzing data. For classification problems, this paper uses the Sugeno fuzzy integral to determine the degrees of importance of individual fuzzy grids, which are generated by partitioning each data attribute with various linguistic values; fuzzy if–then classification rules are then discovered from those fuzzy grids whose degree of importance is larger than or equal to a user-specified minimum threshold. Because it is difficult for users to specify the partition numbers of quantitative attributes, the degree of importance of each training pattern, and the minimum thresholds, these parameters are determined by the evolutionary computation of genetic algorithms (GA). To examine the generalization ability, simulation results on the iris data and the appendicitis data show that the proposed method performs well in comparison with many well-known classification methods.

© 2006 Elsevier Inc. All rights reserved.

Keywords: Fuzzy sets; Genetic algorithms; Fuzzy integral; Data mining; Classification problems

1. Introduction

Data mining is the exploration and analysis of data in order to discover meaningful patterns [3]. The aim of this paper is to propose an effective method that can find a compact set of fuzzy if–then rules for classification problems by using the Sugeno fuzzy integral [28,29] and genetic algorithms (GA) [7]. It should be noted that the consequent part of a fuzzy classification rule is a class label.

This paper proposes a fuzzy data mining technique to discover fuzzy if–then rules for classification problems based on the well-known Apriori algorithm [1].
First, frequent fuzzy grids are found by dividing each quantitative attribute into a pre-specified number of linguistic values. Subsequently, effective fuzzy classification rules are generated from those frequent fuzzy grids. The fuzzy support and the fuzzy confidence are defined to determine which fuzzy grids are frequent and which rules are effective, by comparison with the minimum fuzzy support (min FS) and the minimum fuzzy confidence (min FC), respectively. In particular, the fuzzy support of each fuzzy grid is determined by the Sugeno fuzzy integral.

Applied Mathematics and Computation 185 (2007) 72–83
www.elsevier.com/locate/amc
doi:10.1016/j.amc.2006.07.010
E-mail address: ychu@cycu.edu.tw
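The introduction states that the fuzzy support of a fuzzy grid is computed with the Sugeno fuzzy integral, but this passage does not give the formula. As a rough illustration only, the sketch below implements the textbook discrete Sugeno integral with respect to a Sugeno λ-fuzzy measure. The function names `solve_lambda` and `sugeno_integral`, the argument names `h` and `densities`, and the assumption that every density lies strictly between 0 and 1 are all choices made for this example; how the paper maps training patterns and their degrees of importance onto these inputs is not shown here.

```python
from typing import Sequence


def solve_lambda(densities: Sequence[float]) -> float:
    """Find lam > -1 with prod(1 + lam * g_i) = 1 + lam (lam-fuzzy measure).

    Assumes at least two densities, each strictly between 0 and 1.
    """
    total = sum(densities)
    if abs(total - 1.0) < 1e-6:
        return 0.0  # densities already sum to 1: the measure is additive

    def f(lam: float) -> float:
        prod = 1.0
        for g in densities:
            prod *= 1.0 + lam * g
        return prod - (1.0 + lam)

    if total > 1.0:
        lo, hi = -1.0, -1e-7      # non-trivial root lies in (-1, 0)
    else:
        lo, hi = 1e-7, 1.0        # non-trivial root lies in (0, inf)
        for _ in range(200):      # widen the bracket until f changes sign
            if f(hi) > 0.0:
                break
            hi *= 2.0
        else:
            raise ValueError("could not bracket lambda")

    f_lo = f(lo)
    for _ in range(200):          # plain bisection on the bracket
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sugeno_integral(h: Sequence[float], densities: Sequence[float]) -> float:
    """Discrete Sugeno integral: max_i min(h_(i), g(A_(i))).

    h_(i) are the evaluations sorted in descending order and A_(i) is the
    set of the i elements with the largest evaluations.
    """
    lam = solve_lambda(densities)
    order = sorted(range(len(h)), key=lambda i: h[i], reverse=True)
    g_cum = 0.0
    best = 0.0
    for i in order:
        # lam-rule: g(A u {x}) = g(A) + g({x}) + lam * g(A) * g({x})
        g_cum = min(1.0, g_cum + densities[i] + lam * g_cum * densities[i])
        best = max(best, min(h[i], g_cum))
    return best
```

For instance, under these assumptions `sugeno_integral([0.4, 0.9], [0.2, 0.3])` first solves for λ so that the measure of the whole set equals 1, and then takes the largest of the pairwise minima between the sorted evaluations and the cumulative measures.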