Abstract—Evolutionary algorithms have been actively applied to knowledge discovery, data mining and machine learning under the name of genetics-based machine learning (GBML). The main advantage of using evolutionary algorithms in those application areas is their flexibility: various knowledge extraction criteria such as accuracy and complexity can easily be utilized as fitness functions. On the other hand, their main disadvantage is a heavy computation load, which makes it difficult to apply evolutionary algorithms to large data sets. Improving the scalability of GBML to large data sets is thus one of its main research issues. In our previous studies, we proposed a parallel distributed implementation of GBML and examined its effectiveness for genetic fuzzy rule selection. The main idea was to realize a quadratic speed-up by dividing not only the population but also the training data. Training data subsets were periodically rotated over the sub-populations in order to prevent each sub-population from over-fitting to a specific training data subset. In this paper, we propose the use of this parallel distributed implementation for the design of ensemble classifiers. An ensemble classifier is constructed by combining base classifiers, each of which is obtained from one sub-population. Through computational experiments on parallel distributed genetic fuzzy rule selection, we examine the generalization ability of the designed ensemble classifiers under various settings with respect to the size of the training data subsets and their rotation frequency.

I. INTRODUCTION

Fuzzy rule-based classifiers can be a promising knowledge representation framework for classification problems since each fuzzy rule is linguistically interpretable (e.g., "If x1 is large and x2 is small then Class 1").
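To make the rule form above concrete, here is a minimal sketch of a single-winner fuzzy rule-based classifier on two attributes. The membership functions, the two rules, and the product-based firing strength are hypothetical illustrations, not the classifier actually designed in this paper.

```python
# Minimal sketch of a single-winner fuzzy rule-based classifier on two
# attributes in [0, 1]. The membership functions and the two rules are
# illustrative assumptions, not the classifier studied in the paper.

def small(x):
    # Linear membership function: 1 at x = 0, 0 for x >= 0.5.
    return max(0.0, min(1.0, 1.0 - 2.0 * x))

def large(x):
    # Linear membership function: 0 for x <= 0.5, 1 at x = 1.
    return max(0.0, min(1.0, 2.0 * x - 1.0))

# Each rule: ({attribute: membership function}, consequent class).
rules = [
    ({"x1": large, "x2": small}, 1),  # If x1 is large and x2 is small then Class 1
    ({"x1": small, "x2": large}, 2),  # If x1 is small and x2 is large then Class 2
]

def classify(sample):
    """Assign the class of the rule with the highest firing strength."""
    best_class, best_strength = None, 0.0
    for antecedent, cls in rules:
        strength = 1.0  # product of antecedent membership values
        for attr, mf in antecedent.items():
            strength *= mf(sample[attr])
        if strength > best_strength:
            best_class, best_strength = cls, strength
    return best_class

print(classify({"x1": 0.9, "x2": 0.1}))  # → 1 (the Class 1 rule fires)
```

Note that each rule remains individually readable, which is the interpretability advantage mentioned above.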
Evolutionary algorithms have been frequently used for the design of fuzzy rule-based classifiers under the names of fuzzy genetics-based machine learning (fuzzy GBML), genetic fuzzy systems (GFS) and evolutionary fuzzy systems (EFS) [1, 2]. One hot research issue in the field of fuzzy GBML (and GBML in general) is improving the scalability of evolutionary algorithms to large data sets. This is because even a single evaluation of a fuzzy rule-based classifier needs a long computation time in the case of large data sets. Since one run of a fuzzy GBML algorithm involves tens of thousands of such classifier evaluations, its application to large data sets is very difficult.

This work was supported in part by a Grant-in-Aid for Young Scientists (B) from the Japan Society for the Promotion of Science: KAKENHI (22700239). Y. Nojima, S. Mihara, and H. Ishibuchi are with the Department of Computer Science and Intelligent Systems, Osaka Prefecture University, Sakai, Osaka 599-8531, JAPAN (phone: +81-72-254-9198; fax: +81-72-254-9915; e-mail: nojima@cs.osakafu-u.ac.jp, mihara@ci.cs.osakafu-u.ac.jp, hisaoi@cs.osakafu-u.ac.jp).

One approach to decreasing the computation time of evolutionary algorithms is parallel implementation [4, 5]. When an island model is used on a multi-core computer, the number of islands (i.e., the number of sub-populations) is usually the same as the number of CPU cores. Let N_CPU be the number of sub-populations (i.e., the number of CPU cores). Parallel implementation can potentially decrease the computation time of an evolutionary algorithm to 1/N_CPU in comparison with its non-parallel implementation. In the fields of knowledge discovery, data mining and machine learning, data reduction such as feature and instance selection [6-9] is a well-known and frequently-used approach to improving the scalability of knowledge extraction methods to large data sets.
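The island-model split described above can be sketched as follows. The population size, island count, and bit-string encoding are hypothetical; in a real implementation each island would be evolved by a separate process on its own core.

```python
# Sketch of the island-model split: one sub-population per CPU core.
# POP_SIZE, N_CPU, and the binary encoding are illustrative assumptions.
import random

random.seed(0)
POP_SIZE, N_CPU = 40, 4

# A population of binary strings (e.g., rule inclusion/exclusion bits
# in genetic rule selection).
population = [[random.randint(0, 1) for _ in range(8)]
              for _ in range(POP_SIZE)]

# Divide the population into N_CPU equally sized sub-populations
# (islands); each island would then be evolved on its own core.
islands = [population[i::N_CPU] for i in range(N_CPU)]

# Each core now evaluates only POP_SIZE / N_CPU individuals per
# generation, which is the source of the roughly 1/N_CPU speed-up.
assert all(len(isl) == POP_SIZE // N_CPU for isl in islands)
```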
In our previous studies [10-12], we proposed a parallel distributed implementation of genetic fuzzy rule selection. The main idea was to realize a quadratic speed-up by simultaneously utilizing the two scalability improvement approaches mentioned above. Specifically, we used parallel implementation of genetic fuzzy rule selection [13] together with data reduction. The population was divided into a number of sub-populations for parallel implementation while the training data were divided into multiple training data subsets for data reduction. A single training data subset and a single sub-population were assigned to each core of a multi-core CPU. In order to prevent each sub-population from over-fitting to a specific training data subset, the training data subsets were periodically rotated over the sub-populations (e.g., every 100 generations). It was shown in [10] that a parallel distributed implementation of genetic fuzzy rule selection decreased its computation time to 1/9 (i.e., 1/N_CPU^2 with N_CPU = 3) in comparison with its non-parallel implementation without any clear deterioration in the generalization ability of the obtained fuzzy rule-based classifiers. Moreover, in [12] we proposed the use of very small training data subsets in our parallel distributed genetic fuzzy rule selection, where the number of training data subsets was much larger than the number of CPU cores. This means that only a small portion of the available training data was used for genetic fuzzy rule selection at each generation, which further reduced its computation time.

The use of multiple classifiers as an ensemble is one of the most promising approaches to the design of reliable classifiers with high generalization ability [14]. In particular, bagging is a well-known and frequently-used method [15]. There are several recent studies on GFS with bagging [16-18].
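Two of the ideas above, rotating the training data subsets over sub-populations and combining per-island base classifiers into an ensemble, can be sketched as follows. The specific rotation rule (shift every island to the next subset every 100 generations) and the majority-vote combination are illustrative assumptions consistent with the description, not necessarily the exact schemes used in [10-12].

```python
# Sketch of (a) rotating training data subsets over islands and
# (b) majority voting over the base classifiers obtained from the
# islands. All constants and the rotation rule are assumptions.
from collections import Counter

N_CPU = 3        # number of islands (= CPU cores)
N_SUBSETS = 3    # number of training data subsets (may exceed N_CPU, as in [12])
ROTATION = 100   # rotation interval in generations

def subset_for(island, generation):
    """Index of the training data subset assigned to an island."""
    return (island + generation // ROTATION) % N_SUBSETS

# At generation 0 island i trains on subset i; 100 generations later
# every island has moved on to the next subset, so no island
# over-fits to a single subset.
assert [subset_for(i, 0) for i in range(N_CPU)] == [0, 1, 2]
assert [subset_for(i, 100) for i in range(N_CPU)] == [1, 2, 0]

def ensemble_predict(base_classifiers, sample):
    """Majority vote over the base classifiers (one per island)."""
    votes = Counter(clf(sample) for clf in base_classifiers)
    return votes.most_common(1)[0][0]

# Three hypothetical base classifiers; two of them vote for class 1.
bases = [lambda s: 1, lambda s: 1, lambda s: 2]
print(ensemble_predict(bases, sample=None))  # → 1
```

Because each island sees a different subset at any given generation, the base classifiers are trained on different data, which is the same source of diversity that bagging exploits through bootstrap sampling.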
Ensemble Classifier Design by Parallel Distributed Implementation of Genetic Fuzzy Rule Selection for Large Data Sets

Yusuke Nojima, Member, IEEE, Shingo Mihara, and Hisao Ishibuchi, Senior Member, IEEE