Type-2 fuzzy logic-based classifier fusion for support vector machines

Xiujuan Chen*, Yong Li, Robert Harrison, Yan-Qing Zhang

Department of Computer Science, Georgia State University, P.O. Box 3994, Atlanta, GA 30302-3994, USA

Received 12 February 2007; accepted 23 February 2007; available online 23 October 2007

Abstract

As a machine-learning tool, support vector machines (SVMs) have been gaining popularity due to their promising performance. However, the generalization ability of SVMs often depends on whether the selected kernel function suits the actual classification data. To lessen the sensitivity of SVM classification to the choice of kernel and to improve SVM generalization ability, this paper proposes a fuzzy fusion model that combines multiple SVM classifiers. To better handle the uncertainties present both in real classification data and in the membership functions (MFs) of a traditional type-1 fuzzy logic system (FLS), we apply interval type-2 fuzzy sets to construct a type-2 SVM fusion FLS. This type-2 fusion architecture takes the classification results of the individual SVM classifiers as inputs and generates the combined classification decision as the output. Besides the distances of data examples to the SVM hyperplanes, the type-2 fuzzy SVM fusion system also considers the accuracy of each individual SVM. Our experiments show that the type-2-based SVM fusion classifier outperforms the individual SVM classifiers in most cases. The experiments also show that the type-2 fuzzy logic-based SVM fusion model is generally better than the type-1-based SVM fusion model.

© 2007 Elsevier B.V. All rights reserved.

Keywords: Type-2 FLS; Fuzzy logic; Support vector machines (SVMs); Classifier fusion; Classification; Machine-learning

1. Introduction

Support vector machines have been continuously gaining popularity as a machine-learning tool in the fields of pattern recognition and data classification since they were developed by Vapnik [1] in 1995.
Instead of applying the empirical risk minimization (ERM) principle commonly used in statistical learning methods, SVMs employ the structural risk minimization (SRM) principle to achieve better generalization ability, the central goal of machine-learning tools, than conventional machine-learning algorithms such as neural networks and decision trees. The foundation of SVMs is statistical learning theory. For a binary classification problem with data examples labeled either positive or negative, SVMs aim to find an optimal separating hyperplane that separates the data into the two classes with maximum margin in a transformed feature space of high or even infinite dimension, such as the infinite-dimensional Hilbert space induced by the RBF kernel. The margin is determined by the positive and negative training examples closest to the separating hyperplane, which are called support vectors.

The transformation from the input space to the feature space is carried out through the kernel trick, which allows every dot product to be replaced simply by a kernel function. Different kernel functions can be chosen for SVM classification, each corresponding to a different transformed feature space. Kernel functions therefore play an essential role in SVM classification: they determine the feature space in which data examples are classified and directly affect the SVM classification results and performance.

When applying SVMs to real classification problems, one faces a practical difficulty: how to select a kernel function that fits the particular data better than any other kernel. One obvious way is to try many different kernels and choose the one that works best, but this approach can be time-consuming if the training data have many examples or attributes.
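The kernel trick described above can be sketched in a few lines of Python: an SVM solver never needs the (possibly infinite-dimensional) feature vectors themselves, only the matrix of pairwise kernel values. The particular parameter values below (degree, coef0, gamma) are illustrative assumptions, not values used in the paper.

```python
import numpy as np

def linear_kernel(x, z):
    # Plain dot product: the identity feature map.
    return float(np.dot(x, z))

def polynomial_kernel(x, z, degree=3, coef0=1.0):
    # (x . z + c)^d corresponds to a finite-dimensional polynomial feature space.
    return float((np.dot(x, z) + coef0) ** degree)

def rbf_kernel(x, z, gamma=0.5):
    # exp(-gamma * ||x - z||^2) corresponds to an infinite-dimensional
    # Hilbert feature space.
    x, z = np.asarray(x, dtype=float), np.asarray(z, dtype=float)
    return float(np.exp(-gamma * np.sum((x - z) ** 2)))

def gram_matrix(X, kernel):
    # The matrix of pairwise kernel values; this is all the information
    # an SVM optimizer needs about the transformed feature space.
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel(X[i], X[j])
    return K

X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
K_rbf = gram_matrix(X, rbf_kernel)   # 3x3 matrix with ones on the diagonal
```

Swapping `rbf_kernel` for `polynomial_kernel` or `linear_kernel` changes the feature space, and hence the resulting separating hyperplane, without any other change to the training procedure.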
A less time-consuming alternative is to randomly choose several SVMs with different kernels and build an ensemble model that combines the individual SVM classifiers into a composite classifier. The resulting classifier can be expected to outperform each of its constituent classifiers, because different classifiers may complement one another. Indeed, this complementarity is an important feature of ensemble methods (combinations of classifiers) [2]. It is also a significant difference between ensemble methods and the exhaustive kernel search described above.

Applied Soft Computing 8 (2008) 1222–1231

* Corresponding author. Tel.: +1 404 413 5700; fax: +1 404 413 5717. E-mail address: xchen8@gsu.edu (X. Chen).

doi:10.1016/j.asoc.2007.02.019
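The fusion idea discussed in the introduction, combining each SVM's signed distance to its hyperplane with that SVM's accuracy, can be illustrated with a deliberately simplified crisp fusion rule. This is a hypothetical sketch for intuition only, not the paper's type-2 fuzzy fusion system; the weighting scheme below is an assumption.

```python
def fuse_decisions(distances, accuracies):
    """Combine the outputs of several SVM classifiers for one data example.

    distances  : signed distances to each SVM's hyperplane
                 (positive means the positive class)
    accuracies : validation accuracy of each SVM, used as a confidence weight

    Returns +1 or -1. A simplified crisp stand-in for the fuzzy fusion:
    each SVM's vote is its distance scaled by its accuracy, and the sign
    of the weighted sum is the fused decision.
    """
    score = sum(a * d for a, d in zip(accuracies, distances))
    return 1 if score >= 0 else -1

# Two accurate SVMs place the example on the positive side; a weaker
# SVM disagrees, but its vote carries less weight:
label = fuse_decisions([0.8, 0.5, -0.3], [0.95, 0.90, 0.60])
```

The fuzzy fusion systems studied in the paper replace this hard weighted sum with fuzzy membership functions over the distances and accuracies; the type-2 variant additionally models the uncertainty in those membership functions themselves.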