Multiclass SVM Model Selection Using Particle Swarm Optimization
Bruno Feres de Souza, André C.P.L.F. de Carvalho, Rodrigo Calvo and Renato Porfírio Ishii
Institute of Mathematical and Computer Sciences, University of São Paulo
Av. Trabalhador São-Carlense, 400, São Carlos, SP, Brazil
{bferes,andre,rcalvo,rpi}@icmc.usp.br
Abstract
Tuning SVM hyperparameters is an important step for
achieving good classification performance. In the binary
case, the model selection issue is well studied. For multi-
class problems, it is harder to choose appropriate values
for the base binary models of a decomposition scheme. In
this paper, we employ Particle Swarm Optimization
to perform multiclass model selection, optimizing the
hyperparameters considering both local and global models.
Experiments conducted over 4 benchmark problems show
promising results.
1 Introduction
The determination of the optimal values for the hyperpa-
rameters (regularization term C plus kernel parameters) of
Support Vector Machines (SVMs) [19] is known as model
selection. For the binary case, the issue is well established
and studied by now [7][1]. For the multiclass counterpart,
there is a research effort to develop efficient methods to deal
with the problem [12][14][15].
The most widely used approach for multiclass SVM model se-
lection is based on grid search [9]. It has two versions. The
first globally applies the same hyperparameters values to all
binary SVMs of the decomposition. The second locally ap-
plies different values of hyperparameters to different binary
SVMs, independently. Both approaches have drawbacks.
For example, the same set of values for all classifiers may
be sub-optimal in some situations. On the other hand, opti-
mizing each classifier individually very well does not ensure
that the classifiers will perform well together.
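As a concrete illustration, the global grid-search variant can be sketched with scikit-learn; the dataset, grid values, and cross-validation settings below are illustrative assumptions, not taken from the paper:

```python
# "Global" grid-search variant: the same (C, gamma) pair is applied to
# every binary SVM of the decomposition. scikit-learn's SVC with an RBF
# kernel uses a one-vs-one decomposition internally, so a single fit
# covers all binary subproblems.
from itertools import product

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

C_grid = [2.0 ** k for k in (-2, 0, 2, 4)]
gamma_grid = [2.0 ** k for k in (-4, -2, 0)]

best_score, best_params = -1.0, None
for C, gamma in product(C_grid, gamma_grid):
    # Cross-validated accuracy as the model-selection criterion.
    score = cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=3).mean()
    if score > best_score:
        best_score, best_params = score, (C, gamma)

print(best_params, round(best_score, 3))
```

The local variant would instead run one such search per binary subproblem, which requires building the decomposition explicitly.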
A better way to perform model selection would be to allow the
binary SVMs to have different values for their hyperparam-
eters while also considering the classification performance
of the whole multiclass problem. Unfortunately, the number
of possible combinations of hyperparameters values in this
case is large and the computational burden of the model se-
lection would be too high, since $\left( \prod_{i=1}^{h} d_i \right)^n$
candidate models should be trained,
where $n$ is the number of SVMs of the decomposition, $h$ is
the number of hyperparameters to be tuned for each SVM
and $d_i$ is the number of values the hyperparameter $i$ can
take. The use of heuristic optimization methods may avoid
an exhaustive search through the hyperparameter space.
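To make the burden concrete, consider a hypothetical setup (the numbers are illustrative, not from the paper) with two hyperparameters per SVM and a one-vs-one decomposition of a 10-class problem:

```python
# Illustrative count of joint hyperparameter combinations: h = 2
# hyperparameters (C and the RBF width gamma) with d_i = 10 candidate
# values each, and n = 45 binary SVMs (one-vs-one decomposition of a
# 10-class problem).
from math import prod

d = [10, 10]                     # d_i: candidate values per hyperparameter
n = 10 * (10 - 1) // 2           # one-vs-one binary SVMs for 10 classes

combinations = prod(d) ** n      # (prod_i d_i) ** n
print(combinations == 10 ** 90)  # 100**45 == 10**90
```

Even this modest grid yields 10^90 combinations, far beyond any exhaustive search.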
In this paper, we apply the Particle Swarm Op-
timization algorithm (PSO) [13] to efficiently deal with the
model selection problem for multiclass SVMs with the RBF
kernel. The method is able to efficiently tune multiple SVM
hyperparameters in simultaneous local and global fashions,
i.e., the parameters of the classification algorithm are opti-
mized considering both the individual components and the
interactions between the parts.
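The standard PSO velocity and position updates from [13] can be sketched on a toy objective standing in for the estimated generalization error over two hyperparameters (e.g., log C and log gamma); the objective, swarm size, and coefficients below are illustrative assumptions:

```python
# Minimal PSO sketch: each particle encodes a candidate hyperparameter
# vector; velocities are pulled toward the particle's own best position
# (pbest) and the swarm's best (gbest). The quadratic objective is a
# placeholder for a cross-validation error estimate.
import random

random.seed(0)

def objective(pos):
    # Placeholder for the estimated generalization error; minimum at (1, -3).
    return (pos[0] - 1.0) ** 2 + (pos[1] + 3.0) ** 2

n_particles, n_dims, n_iters = 20, 2, 100
w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration coefficients

pos = [[random.uniform(-10, 10) for _ in range(n_dims)] for _ in range(n_particles)]
vel = [[0.0] * n_dims for _ in range(n_particles)]
pbest = [p[:] for p in pos]
gbest = min(pbest, key=objective)

for _ in range(n_iters):
    for i in range(n_particles):
        for d in range(n_dims):
            r1, r2 = random.random(), random.random()
            vel[i][d] = (w * vel[i][d]
                         + c1 * r1 * (pbest[i][d] - pos[i][d])
                         + c2 * r2 * (gbest[d] - pos[i][d]))
            pos[i][d] += vel[i][d]
        if objective(pos[i]) < objective(pbest[i]):
            pbest[i] = pos[i][:]
    gbest = min(pbest, key=objective)

print([round(x, 2) for x in gbest])
```

In the actual model-selection setting, each particle would hold the hyperparameters of all binary SVMs at once, so the fitness evaluation captures both the individual components and their interactions.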
The paper is organized as follows. Section 2 briefly in-
troduces SVMs for binary and multiclass cases. In addition,
a method for estimating SVM generalization error is re-
viewed. Section 3 presents relevant work on SVM model
selection and discusses some of its drawbacks. Sec-
tion 4 describes the proposed PSO-based method for model
selection. Section 5 shows the results of the experimental
study. Finally, Section 6 draws some conclusions.
2 Support Vector Machines
2.1 Binary SVMs
SVMs constitute a new class of learning algorithms that
have exhibited good performance on a large range of ap-
plications [19]. In the simplest case of binary classifica-
tion, they work by constructing a hyperplane that maxi-
mizes the margin of separation between the examples of the
two classes. By doing so, they implement the principle of
Structural Risk Minimization, which deals with a tradeoff
between the empirical risk (commonly referred to as training
error) and the classifier complexity, in order to minimize a
theoretical bound on the generalization error of the classi-
fier. A comprehensive introduction to SVMs can be found in [19].
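The tradeoff controlled by the regularization term C can be illustrated with a small scikit-learn sketch; the dataset and C values are illustrative assumptions:

```python
# Effect of the regularization term C on a soft-margin SVM: a small C
# tolerates more training errors in exchange for a wider margin (many
# support vectors), while a large C penalizes training errors heavily
# and fits the training set more tightly.
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, cluster_std=1.5, random_state=0)

results = {}
for C in (0.01, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    results[C] = (int(clf.n_support_.sum()), clf.score(X, y))

for C, (n_sv, acc) in results.items():
    print(f"C={C}: support vectors={n_sv}, training accuracy={acc:.3f}")
```

The small-C model typically retains far more support vectors, reflecting the wider margin; choosing C (and the kernel parameters) well is precisely the model selection problem addressed in this paper.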
Proceedings of the Sixth International Conference on Hybrid Intelligent Systems (HIS'06)
0-7695-2662-4/06 $20.00 © 2006