Multiclass SVM Model Selection Using Particle Swarm Optimization

Bruno Feres de Souza, André C.P.L.F. de Carvalho, Rodrigo Calvo and Renato Porfírio Ishii
Institute of Mathematical and Computer Sciences, University of São Paulo
Av. Trabalhador São-Carlense, 400, São Carlos, SP, Brazil
{bferes,andre,rcalvo,rpi}@icmc.usp.br

Abstract

Tuning SVM hyperparameters is an important step towards achieving good classification performance. In the binary case, the model selection issue is well studied. For multiclass problems, it is harder to choose appropriate values for the base binary models of a decomposition scheme. In this paper, we employ Particle Swarm Optimization to perform multiclass model selection, optimizing the hyperparameters while considering both local and global models. Experiments conducted over 4 benchmark problems show promising results.

1 Introduction

The determination of optimal values for the hyperparameters (the regularization term C plus the kernel parameters) of Support Vector Machines (SVMs) [19] is known as model selection. For the binary case, the issue is well established and has been extensively studied [7][1]. For the multiclass counterpart, there is an ongoing research effort to develop efficient methods to deal with the problem [12][14][15].

The most widely used approach for multiclass SVM model selection is based on Grid search [9]. It has two versions. The first globally applies the same hyperparameter values to all binary SVMs of the decomposition. The second locally applies different hyperparameter values to each binary SVM, independently. Both approaches have drawbacks. For example, the same set of values for all classifiers may be sub-optimal in some situations. On the other hand, optimizing each classifier very well individually does not ensure that they will perform well together.
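The two Grid-search variants discussed above can be sketched as follows. This is an illustration only, using scikit-learn's `SVC` (which decomposes multiclass problems one-vs-one internally); the paper does not prescribe a specific library, and the grid values are arbitrary choices for the example.

```python
# Global vs. local Grid search for multiclass SVM model selection (sketch).
from itertools import combinations

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
grid = {"C": [2.0**k for k in (-1, 1, 3)], "gamma": [2.0**k for k in (-3, -1, 1)]}

# Global version: one (C, gamma) pair shared by every binary SVM of the
# one-vs-one decomposition that SVC builds internally.
global_search = GridSearchCV(SVC(kernel="rbf"), grid, cv=3).fit(X, y)
print("global best:", global_search.best_params_)

# Local version: each binary subproblem (class pair) is tuned independently,
# without regard for how the tuned classifiers will vote together.
local_best = {}
for a, b in combinations(np.unique(y), 2):
    mask = np.isin(y, [a, b])
    search = GridSearchCV(SVC(kernel="rbf"), grid, cv=3).fit(X[mask], y[mask])
    local_best[(a, b)] = search.best_params_
print("local best per pair:", local_best)
```

The local loop makes the drawback concrete: each pairwise model is selected by its own cross-validation score, so nothing in the procedure accounts for the joint multiclass performance.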
A better way to perform model selection would be to allow the binary SVMs to have different values for their hyperparameters while also considering the classification performance on the whole multiclass problem. Unfortunately, the number of possible combinations of hyperparameter values in this case is large, and the computational burden of the model selection would be too high, since (∏_{i=1}^{h} d_i)^n models would have to be trained, where n is the number of SVMs in the decomposition, h is the number of hyperparameters to be tuned for each SVM and d_i is the number of values the i-th hyperparameter can take. The use of heuristic optimization methods may avoid an exhaustive search throughout the hyperparameter space.

In this paper, we apply the Particle Swarm Optimization algorithm (PSO) [13] to efficiently deal with the model selection problem for multiclass SVMs with the RBF kernel. The method is able to efficiently tune multiple SVM hyperparameters in simultaneous local and global fashions, i.e., the parameters of the classification algorithm are optimized considering both the individual components and the interactions between the parts.

The paper is organized as follows. Section 2 briefly introduces SVMs for the binary and multiclass cases. In addition, a method for estimating the SVM generalization error is reviewed. Section 3 presents relevant work on SVM model selection and discusses some of its drawbacks. Section 4 describes the proposed PSO-based method for model selection. Section 5 shows the results of the experimental study. Finally, Section 6 draws some conclusions.

2 Support Vector Machines

2.1 Binary SVMs

SVMs constitute a class of learning algorithms that have exhibited good performance on a large range of applications [19]. In the simplest case of binary classification, they work by constructing a hyperplane that maximizes the margin of separation between the examples of the two classes.
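To see why the exhaustive search is infeasible, note that with h = 2 hyperparameters (C and the RBF gamma), 10 candidate values each, and n = 10 binary SVMs, (10·10)^10 = 10^20 models would have to be trained. The PSO alternative can be sketched minimally as below. This is our own simplified illustration, not the authors' implementation: the particles here encode a single (log2 C, log2 gamma) pair shared by all binary SVMs, whereas the paper's method also optimizes per-binary-classifier values; the swarm size, iteration count and PSO coefficients are arbitrary example choices.

```python
# Minimal PSO sketch for SVM hyperparameter tuning (illustrative only).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = load_iris(return_X_y=True)

def fitness(p):
    # A particle encodes (log2 C, log2 gamma); fitness is 3-fold CV accuracy.
    c, g = 2.0 ** p
    return cross_val_score(SVC(kernel="rbf", C=c, gamma=g), X, y, cv=3).mean()

n_particles, n_iters = 8, 10
w, c1, c2 = 0.7, 1.5, 1.5                        # inertia and acceleration terms
pos = rng.uniform(-5, 5, size=(n_particles, 2))  # search in log2 space
vel = np.zeros_like(pos)
pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(n_iters):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, -10, 10)
    fit = np.array([fitness(p) for p in pos])
    improved = fit > pbest_fit
    pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
    gbest = pbest[pbest_fit.argmax()].copy()

best_C, best_gamma = 2.0 ** gbest
print(f"best C={best_C:.3g}, gamma={best_gamma:.3g}, cv acc={pbest_fit.max():.3f}")
```

Each PSO iteration evaluates only `n_particles` candidate settings, so the total number of trained models grows linearly with the number of iterations rather than exponentially with the number of hyperparameters.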
By doing so, they implement the principle of Structural Risk Minimization, which deals with a tradeoff between the empirical risk (commonly referred to as the training error) and the classifier complexity, in order to minimize a theoretical bound on the generalization error of the classifier. A comprehensive introduction to SVMs can be found in [19].

Proceedings of the Sixth International Conference on Hybrid Intelligent Systems (HIS'06) 0-7695-2662-4/06 $20.00 © 2006
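The tradeoff above is exactly what the regularization term C controls, which is why tuning it matters for model selection. The following sketch (our illustration on a synthetic noisy dataset, not from the paper) contrasts a small C, which favors a wider margin and a simpler model, with a large C, which drives the empirical risk down at the cost of a more complex decision boundary.

```python
# Effect of C on the empirical-risk / complexity tradeoff (sketch).
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# 2-D toy problem with 10% label noise, so the empirical risk cannot
# reach zero without a complex decision boundary.
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, flip_y=0.1, random_state=42)

small = SVC(kernel="rbf", C=0.01).fit(X, y)   # wide margin, simpler model
large = SVC(kernel="rbf", C=100.0).fit(X, y)  # low empirical risk

for name, clf in (("C=0.01", small), ("C=100", large)):
    print(f"{name:7s} support vectors={clf.n_support_.sum():3d} "
          f"training accuracy={clf.score(X, y):.3f}")
```

With the small C, most training points end up inside the margin and become support vectors; with the large C, the model fits the training labels more closely, including part of the injected noise.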