Type-2 fuzzy logic-based classifier fusion for support vector machines

Xiujuan Chen*, Yong Li, Robert Harrison, Yan-Qing Zhang

Department of Computer Science, Georgia State University, P.O. Box 3994, Atlanta, GA 30302-3994, USA

Received 12 February 2007; accepted 23 February 2007; available online 23 October 2007

Abstract

As a machine-learning tool, support vector machines (SVMs) have been gaining popularity due to their promising performance. However, the generalization ability of SVMs often depends on whether the selected kernel function suits the actual classification data. To lessen the sensitivity of SVM classification to the choice of kernel and to improve SVM generalization ability, this paper proposes a fuzzy fusion model that combines multiple SVM classifiers. To better handle the uncertainties present both in real classification data and in the membership functions (MFs) of a traditional type-1 fuzzy logic system (FLS), we apply interval type-2 fuzzy sets to construct a type-2 SVM fusion FLS. This type-2 fusion architecture takes the classification results of the individual SVM classifiers as inputs and generates the combined classification decision as the output. Besides the distances of data examples to the SVM hyperplanes, the type-2 fuzzy SVM fusion system also considers the accuracy of each individual SVM. Our experiments show that the type-2-based SVM fusion classifier outperforms the individual SVM classifiers in most cases. The experiments also show that the type-2 fuzzy logic-based SVM fusion model is generally better than the type-1-based SVM fusion model.

© 2007 Elsevier B.V. All rights reserved.

Keywords: Type-2 FLS; Fuzzy logic; Support vector machines (SVMs); Classifier fusion; Classification; Machine-learning

1. Introduction

Support vector machines have been continuously gaining popularity as a machine-learning tool in the fields of pattern recognition and data classification since they were developed by Vapnik [1] in 1995.
Instead of applying the empirical risk minimization (ERM) principle commonly used in statistical learning methods, SVMs employ the structural risk minimization (SRM) principle to achieve better generalization ability, the central goal of machine-learning tools, than conventional machine-learning algorithms such as neural networks and decision trees. The foundation of SVMs is statistical learning theory. For a binary classification problem with data examples labeled either positive or negative, SVMs aim to find an optimal separating hyperplane that separates the data into the two classes with maximum margin in a transformed feature space of high or even infinite dimension, such as the infinite-dimensional Hilbert space induced by the RBF kernel. The margin is determined by the positive and negative training examples closest to the separating hyperplane, which are called support vectors.

The transformation from the input space to the feature space is carried out through the kernel trick, which allows every dot product to be replaced simply by a kernel function. Different kernel functions can be chosen for SVM classification, each corresponding to a different transformed feature space. Kernel functions therefore play an essential role in SVM classification: they determine the feature space in which data examples are classified and directly affect the SVM classification results and performance.

When applying SVMs to real classification problems, one faces a practical difficulty: how to select a kernel function that fits the particular data better than any other kernel. One obvious way is to try many different kernels and choose the one that works best, but this approach can be time-consuming if the training data have many examples or attributes.
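The kernel trick described above can be sketched in a few lines of Python: an SVM solver never needs the (possibly infinite-dimensional) feature vectors themselves, only the matrix of pairwise kernel values. The particular parameter values below (degree, coef0, gamma) are illustrative assumptions, not values used in the paper.

```python
import numpy as np

def linear_kernel(x, z):
    # Plain dot product: the identity feature map.
    return float(np.dot(x, z))

def polynomial_kernel(x, z, degree=3, coef0=1.0):
    # (x . z + c)^d corresponds to a finite-dimensional polynomial feature space.
    return float((np.dot(x, z) + coef0) ** degree)

def rbf_kernel(x, z, gamma=0.5):
    # exp(-gamma * ||x - z||^2) corresponds to an infinite-dimensional
    # Hilbert feature space.
    x, z = np.asarray(x, dtype=float), np.asarray(z, dtype=float)
    return float(np.exp(-gamma * np.sum((x - z) ** 2)))

def gram_matrix(X, kernel):
    # The matrix of pairwise kernel values; this is all the information
    # an SVM optimizer needs about the transformed feature space.
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel(X[i], X[j])
    return K

X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
K_rbf = gram_matrix(X, rbf_kernel)   # 3x3 matrix with ones on the diagonal
```

Swapping `rbf_kernel` for `polynomial_kernel` or `linear_kernel` changes the feature space, and hence the resulting separating hyperplane, without any other change to the training procedure.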
A less time-consuming alternative is to randomly choose several SVMs with different kernels and build an ensemble model that combines the individual SVM classifiers into a composite classifier. The resulting classifier can be expected to outperform each of its constituent classifiers, because different classifiers may complement one another. Indeed, this complementarity is an important feature of ensemble methods (combinations of classifiers) [2]. It is also a significant difference between ensemble methods and the exhaustive kernel search described above.

Applied Soft Computing 8 (2008) 1222–1231

* Corresponding author. Tel.: +1 404 413 5700; fax: +1 404 413 5717. E-mail address: xchen8@gsu.edu (X. Chen).

doi:10.1016/j.asoc.2007.02.019
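The fusion idea discussed in the introduction, combining each SVM's signed distance to its hyperplane with that SVM's accuracy, can be illustrated with a deliberately simplified crisp fusion rule. This is a hypothetical sketch for intuition only, not the paper's type-2 fuzzy fusion system; the weighting scheme below is an assumption.

```python
def fuse_decisions(distances, accuracies):
    """Combine the outputs of several SVM classifiers for one data example.

    distances  : signed distances to each SVM's hyperplane
                 (positive means the positive class)
    accuracies : validation accuracy of each SVM, used as a confidence weight

    Returns +1 or -1. A simplified crisp stand-in for the fuzzy fusion:
    each SVM's vote is its distance scaled by its accuracy, and the sign
    of the weighted sum is the fused decision.
    """
    score = sum(a * d for a, d in zip(accuracies, distances))
    return 1 if score >= 0 else -1

# Two accurate SVMs place the example on the positive side; a weaker
# SVM disagrees, but its vote carries less weight:
label = fuse_decisions([0.8, 0.5, -0.3], [0.95, 0.90, 0.60])
```

The fuzzy fusion systems studied in the paper replace this hard weighted sum with fuzzy membership functions over the distances and accuracies; the type-2 variant additionally models the uncertainty in those membership functions themselves.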