Optimization of Complex SVM Kernels Using a Hybrid Algorithm Based on Wasp Behaviour

Dana Simian, Florin Stoica, and Corina Simian
University "Lucian Blaga" of Sibiu, Faculty of Sciences
5-7 Dr. I. Raţiu Str., 550012 Sibiu, România

Abstract. The aim of this paper is to present a new method for the optimization of SVM multiple kernels. The kernel substitution can be used to define many other types of learning machines distinct from SVMs. We introduce a new hybrid method which uses on the first level an evolutionary algorithm based on wasp behaviour and on the co-mutation operator LR-Mijn, and on the second level an SVM algorithm which computes the quality of the chromosomes. The most important details of our algorithms are presented. Testing and validation show that the multiple kernels obtained with our genetic approach improve the classification accuracy up to 94.12% for the "leukemia" data set.

1 Introduction

The classification task is to assign an object to one or several classes, based on a set of attributes. A classification task supposes the existence of training and testing data given in the form of data instances. Each instance in the training set contains one target value, named the class label, and several attributes, named features. The accuracy of the model for a specific test set is defined as the percentage of test-set items that are correctly classified by the model. If the accuracy is acceptable, the model can be used to classify items for which the class label is unknown. Two types of approaches to classification can be distinguished: classical statistical approaches (discriminant analysis, generalized linear models) and modern statistical machine learning (neural networks, evolutionary algorithms, support vector machines (SVMs), belief networks, classification trees, Gaussian processes).
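The accuracy measure defined above can be sketched in a few lines of Python. This is an illustrative helper, not code from the paper; the labels below are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Percentage of test-set items whose predicted class matches the true label."""
    assert len(y_true) == len(y_pred), "label lists must have equal length"
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)

# Hypothetical class labels for a small test set of 8 items:
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 1, 1]
print(accuracy(y_true, y_pred))  # 6 of 8 items correct -> 75.0
```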
In recent years, SVMs have become a very popular tool for machine learning tasks and have been successfully applied in classification, regression, and novelty detection. SVMs have found applications in many fields: particle identification, face identification, text categorization, bioinformatics, database marketing, and the classification of clinical data. The goal of an SVM is to produce a model which predicts the target value of data instances in the testing set, for which only the attributes are given. Training involves the optimization of a convex cost function. If the data set is separable, we obtain an optimal separating hyperplane with a maximal margin (see Vapnik [12]). In the case of non-separable data, a successful approach is the kernel method: using an appropriate kernel, the data are projected into a higher-dimensional space in which they are separable by a hyperplane [2,12].

I. Lirkov, S. Margenov, and J. Waśniewski (Eds.): LSSC 2009, LNCS 5910, pp. 361-368, 2010. © Springer-Verlag Berlin Heidelberg 2010
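The kernel method described above can be illustrated with the classic homogeneous polynomial kernel k(x, y) = (x . y)^2: it equals an ordinary dot product after an explicit map into a higher-dimensional space, yet it never computes that map. This is a minimal sketch of the general idea, not the multiple kernels the paper constructs; the vectors used are hypothetical.

```python
import math

def poly_kernel(x, y):
    # k(x, y) = (x . y)^2 for x, y in R^2, evaluated directly in input space.
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def feature_map(x):
    # Explicit map phi: R^2 -> R^3 realizing the same kernel:
    # phi(x) = (x1^2, sqrt(2) * x1 * x2, x2^2).
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x, y = (1.0, 2.0), (3.0, 0.5)
print(poly_kernel(x, y))                    # (1*3 + 2*0.5)^2 = 16.0
print(dot(feature_map(x), feature_map(y)))  # same value, computed in R^3: 16.0
```

An SVM trained with such a kernel implicitly finds a separating hyperplane in the higher-dimensional space while only ever evaluating k(x, y) on pairs of input points.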