Abstract—In this paper we present and evaluate a novel
algorithm for ensemble creation. The main idea of the
algorithm is to first independently train a fixed number of
neural networks (here ten) and then use genetic programming
to combine these networks into an ensemble. The use of genetic
programming makes it possible to not only consider ensembles
of different sizes, but also to use ensembles as intermediate
building blocks. The final result is therefore more correctly
described as an ensemble of neural network ensembles. The
experiments show that the proposed method, when evaluated
on 22 publicly available data sets, obtains very high accuracy,
clearly outperforming the other methods evaluated. In this
study several micro techniques are used, and we believe that
they all contribute to the increased performance. One such
micro technique, aimed at reducing overtraining, is the
training method, called tombola training, used during genetic
evolution. When using tombola training, training data is
regularly resampled into new parts, called training groups.
Each ensemble is then evaluated on every training group, and
the actual fitness is determined solely from the result on the
hardest group.
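The tombola scheme described above can be sketched in a few lines. The function names and the even-split details below are illustrative assumptions; the text only specifies that training data is regularly resampled into groups and that fitness comes from the hardest group:

```python
import random

def tombola_fitness(ensemble_acc, train_data, n_groups, rng=random):
    """Score an ensemble by its accuracy on the hardest training group.

    `ensemble_acc(group)` is assumed to return the ensemble's accuracy
    on a list of training instances. The split into equally sized groups
    is an illustrative choice; in tombola training it is redone
    (resampled) at regular intervals during evolution.
    """
    data = list(train_data)
    rng.shuffle(data)
    size = len(data) // n_groups  # any remainder is dropped in this sketch
    groups = [data[i * size:(i + 1) * size] for i in range(n_groups)]
    # Fitness is determined solely by the result on the hardest group.
    return min(ensemble_acc(g) for g in groups)
```

Taking the minimum over groups is what discourages overtraining: an ensemble only scores well if it performs well on every part of the training data.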
I. INTRODUCTION
When performing predictive classification, the primary
goal is to obtain high accuracy; i.e. few misclassifications
when the model is applied to novel data. With this in mind,
Artificial Neural Networks (ANNs) are often the technique of
choice when there is no explicit demand for transparent models.
ANNs are known to normally produce very accurate models
on most data sets, and have been used successfully in a
variety of domains.
Within the research community it is, however, also well
known that the use of ANN ensembles often results in even
higher accuracy; see e.g. [1] and [2]. Despite this, the use of
ensembles in applications is still limited. Two possible
reasons for this are insufficient knowledge about the benefits
of using ensembles and limited support in most data mining
tools. In addition, even when ensembles are used, very
simple variants are often preferred. A typical choice would
be to train a fixed number (like five or ten) of ANNs with
identical topology, and simply average their outputs.
U. Johansson is with the School of Business and Informatics, University
of Borås, SE-501 90 Borås, Sweden. (phone: +46 (0)33 4354489; email:
ulf.johansson@hb.se).
T. Löfström is with the School of Business and Informatics, University
of Borås, Sweden. (email: tuve.lofstrom@hb.se).
R. König is with the School of Business and Informatics, University of
Borås, Sweden. (email: rikard.konig@hb.se).
L. Niklasson is with the School of Humanities and Informatics,
University of Skövde, Sweden. (email: lars.niklasson@his.se).
In this paper we suggest and evaluate a novel, rather
technical, algorithm for the construction of ANN ensembles,
called GEMS (Genetic Ensemble Member Selection). The
algorithm uses genetic programming to actively search among
possible ensembles built from a pool of trained ANNs.
Although GEMS has a multitude of parameters, the basic
principle is easy to understand. We do not claim, at this
stage, to use the algorithm anywhere near optimally, so this
study should be seen as a demonstration of its potential.
With this in mind, the main purpose is to evaluate GEMS on
a large number of data sets in order to establish a lower
bound for the level of accuracy to expect.
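This page does not specify GEMS's internal representation, but the description so far (genetic programming over a pool of trained ANNs, with ensembles themselves usable as building blocks) suggests a tree encoding. The sketch below is a hypothetical illustration of that idea, not the published algorithm:

```python
import random

# Hypothetical encoding: an ensemble is either a leaf (an index into the
# pool of trained ANNs) or a list of sub-ensembles whose outputs are
# averaged, so ensembles can serve as building blocks for larger ones.
def random_ensemble(pool_size, depth, rng=random):
    """Grow a random ensemble tree, as genetic programming might."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randrange(pool_size)  # leaf: one ANN from the pool
    return [random_ensemble(pool_size, depth - 1, rng) for _ in range(2)]

def ensemble_output(tree, ann_outputs):
    """Evaluate a tree; `ann_outputs[i]` is ANN i's output on an instance."""
    if isinstance(tree, int):
        return ann_outputs[tree]
    return sum(ensemble_output(c, ann_outputs) for c in tree) / len(tree)
```

Under this encoding, an ensemble of ensembles such as `[0, [1, 2]]` first averages ANNs 1 and 2, then averages that result with ANN 0.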
II. BACKGROUND AND RELATED WORK
Any algorithm aimed at building ensembles must
somehow both train individual models and combine these
into the actual ensemble. Standard techniques like bagging,
introduced by Breiman [3], and boosting, introduced by
Schapire [4], rely on resampling techniques to obtain
different training sets for each of the classifiers.
Bagging repeatedly samples (with replacement) from a
data set according to a uniform probability distribution. Each
bootstrap sample has the same size as the original data and is
used to train one classifier. After training, a majority vote
over all classifiers is normally used when classifying a novel
instance.
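The bagging procedure just described can be sketched as follows; `train_fn` stands in for any routine that fits a classifier to a data set:

```python
import random

def bagging(train_data, n_classifiers, train_fn):
    """Train each classifier on its own bootstrap sample of the data."""
    n = len(train_data)
    classifiers = []
    for _ in range(n_classifiers):
        # Sample uniformly with replacement; each bootstrap sample has
        # the same size as the original data set.
        bootstrap = [random.choice(train_data) for _ in range(n)]
        classifiers.append(train_fn(bootstrap))
    return classifiers

def predict(classifiers, instance):
    """Classify a novel instance by majority vote over the ensemble."""
    votes = [clf(instance) for clf in classifiers]
    return max(set(votes), key=votes.count)
```

Because sampling is with replacement, each bootstrap sample omits roughly a third of the original instances, which is what makes the trained classifiers differ.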
Boosting is an iterative procedure where the distribution
of training examples is adaptively changed so that the
classifiers focus on examples that are hard to classify.
Boosting assigns a weight to each training example, and this
weight is updated depending on whether or not the current
classifier classified the example correctly. Naturally, examples
incorrectly classified have their weights increased, while
those classified correctly have their weights decreased. The
final ensemble is obtained by combining the classifiers from
each iteration. Boosting algorithms typically differ in the
way they update the weights and how predictions from the
base classifiers are combined to give the ensemble
prediction. Both bagging and boosting can be applied to
ANNs, although they are more common when using
decision trees; see e.g. [5].
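The weight update can be illustrated with an AdaBoost-style rule. This is one concrete choice; as noted above, boosting algorithms differ in exactly how the weights are updated:

```python
import math

def boosting_weight_update(weights, correct, error):
    """One boosting round: increase the weights of misclassified
    examples, decrease those of correctly classified ones, and
    renormalise so the weights again sum to one.

    `correct[i]` says whether the current classifier got example i
    right; `error` is its weighted error rate, assumed in (0, 0.5).
    """
    alpha = 0.5 * math.log((1.0 - error) / error)  # classifier's vote weight
    new_w = [w * math.exp(-alpha if ok else alpha)
             for w, ok in zip(weights, correct)]
    total = sum(new_w)
    return [w / total for w in new_w], alpha
```

In AdaBoost, `alpha` also weights this classifier's vote when the classifiers from all iterations are combined into the final ensemble prediction.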
Another option is to train a number of classifiers
independently (most often on the same data) and then
either combine all classifiers or select a subset to form the
actual ensemble.
Building Neural Network Ensembles using Genetic Programming
Ulf Johansson, Tuve Löfström, Rikard König and Lars Niklasson
2006 International Joint Conference on Neural Networks
Sheraton Vancouver Wall Centre Hotel, Vancouver, BC, Canada
July 16-21, 2006