Abstract—We have recently introduced an incremental learning algorithm, Learn++.NSE, designed to learn in nonstationary environments, which has been shown to provide an attractive solution to a number of concept drift problems under different drift scenarios. However, Learn++.NSE relies on each classifier's error on the most recent data to weigh the classifiers in the ensemble. For balanced class distributions, this approach works very well, but when faced with imbalanced data, error is no longer an acceptable measure of performance. On the other hand, the well-established SMOTE algorithm can address the class imbalance issue; however, it cannot learn in nonstationary environments. While there is some literature available on learning in nonstationary environments and on learning from imbalanced data separately, the combined problem of learning from imbalanced data coming from nonstationary environments is underexplored. Therefore, in this work we propose two modified frameworks for an algorithm that can incrementally learn from imbalanced data coming from a nonstationary environment.
Index Terms—concept drift, imbalanced data, ensemble of
classifiers, incremental learning in nonstationary environments
I. INTRODUCTION
Concept drift, associated with learning in nonstationary environments, receives substantially less attention in most classical machine learning literature, particularly when such an environment generates imbalanced class distributions. Concept drift can be defined as a change in the underlying distribution that generates the data used to train a classifier; the problem is that classifiers trained on previously available data may become obsolete. While learning in nonstationary environments and learning from class imbalance have been researched independently, and several novel algorithms have been proposed to handle nonstationary concepts or imbalanced data, there has been relatively little work on the combination of these problems [1-3]. Learning in a nonstationary environment requires that the learner be able to learn from a concept that is changing in time. This change can be real or virtual. Real drift is a change in the class-conditional likelihoods, whereas virtual drift is the result of an incomplete representation of the true distribution in the current data. Real and virtual drift may occur at the same time, and it can be difficult to determine which one is occurring, and even more difficult to tell whether both are occurring at the same time [4].
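The distinction between the two drift types can be illustrated with a toy stream generator. The function names, class means, and batch sizes below are illustrative assumptions, not part of the paper's experimental setup:

```python
import numpy as np

def real_drift_batch(t, n=200):
    """Real drift: the class-conditional likelihood p(x|y) changes with
    time t; here, the mean of class 1 slides along the first feature axis."""
    rng = np.random.default_rng(0)
    X0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(n // 2, 2))
    X1 = rng.normal(loc=[2.0 + 0.5 * t, 0.0], scale=1.0, size=(n // 2, 2))
    X = np.vstack([X0, X1])
    y = np.array([0] * (n // 2) + [1] * (n // 2))
    return X, y

def virtual_drift_batch(t, n=20):
    """Virtual drift: the true distribution is fixed, but each small batch
    is an incomplete sample of it, so successive batches appear to drift."""
    rng = np.random.default_rng(t)  # a different small sample each step
    X = rng.normal(loc=0.0, scale=3.0, size=(n, 2))
    y = (X[:, 0] > 0).astype(int)
    return X, y
```

In the first generator the decision boundary truly moves with t; in the second, only the sampling is incomplete, yet a learner seeing the small batches cannot easily tell the two cases apart.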
Manuscript received January 31, 2010. The manuscript was revised and resubmitted on May 2, 2010. This work was supported by the National Science Foundation under Grant No. ECCS-0926159.
Authors are with the ECE Department at Rowan University and are part of the Signal Processing & Pattern Recognition Lab, Glassboro, NJ 08028, USA (e-mail: gditzler@ieee.org, polikar@rowan.edu).
The main contribution of this work is a supervised, ensemble-of-classifiers based incremental learning algorithm that is designed to work in nonstationary environments experiencing class imbalance in the data. This framework is based on the Learn++.NSE algorithm; however, the error of each classifier is no longer the contributing factor in the weighting scheme. Following a review of approaches for imbalanced data, nonstationary learning, and a combination of the two in Section II, we describe the algorithm in Section III, followed by results on several databases subject to concept drift and class imbalance in Section IV. Finally, Section V contains conclusions and final remarks.
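As noted above, raw classification error becomes misleading under class imbalance, which motivates replacing it in the weighting scheme. A short numeric sketch illustrates why; the 95/5 class ratio below is an illustrative assumption:

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, treating the minority
    class (label 1) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def f_measure(y_true, y_pred):
    """Harmonic mean of precision and recall on the minority class."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

# 95 majority-class examples, 5 minority; a classifier that always
# predicts the majority class looks good by error alone.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100
error = sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)
# error is only 0.05, yet the minority class is never detected
# (F-measure is 0), so error alone would reward this useless model
```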
II. BACKGROUND
A. Nonstationary Environments
One of the earliest approaches for classifying data in a
nonstationary environment uses a sliding window, whose
size is determined by the rate of drift. Therefore, an algo-
rithm that uses an adjustable window typically follows an
active approach to drift detection, constantly seeking to
detect change as presented in [5-9]. Typically, in such algo-
rithms, there is a drift detection mechanism that updates the
current model only when the drift is detected, assuming that
the old model (and hence the old data) is no longer relevant.
The faster the drift rate, the shorter the window length, with
the understanding that older data are becoming increasingly
less relevant as the environment is changing. Conversely, the window size grows if the drift is slow or nonexistent, with the understanding that data from several time steps ago may still be relevant and useful for classification purposes.
The FLORA family of algorithms was one of the first methods to employ the dynamic window length approach [8]. While this approach is very simple, it does not allow for incremental learning, since incremental learning requires acquiring knowledge from the current data and existing model(s) without requiring access to previous data. A passive approach to learning concept drift, on the other hand, simply accepts that a concept drift may or may not have occurred, and updates the model with each incoming batch of the data stream. The algorithms proposed in [1], [3], [10]-[12] are all passive algorithms.
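An active, FLORA-style dynamic window can be sketched as follows. The adjustment rule, thresholds, and class name below are illustrative assumptions, not the exact FLORA heuristics:

```python
from collections import deque

class DynamicWindow:
    """Keep a window of recent (x, y) examples; shrink it when the recent
    error rate suggests drift, and grow it while the concept is stable."""

    def __init__(self, max_size=200, min_size=20, shrink=0.7, err_threshold=0.3):
        self.max_size = max_size
        self.min_size = min_size
        self.shrink = shrink            # multiplicative cut on suspected drift
        self.err_threshold = err_threshold
        self.size = max_size            # current target window length
        self.window = deque(maxlen=max_size)

    def update(self, batch, batch_error):
        """Add a labeled batch and adapt the window length to the observed
        error of the current model on that batch."""
        if batch_error > self.err_threshold:      # drift suspected: forget fast
            self.size = max(self.min_size, int(self.size * self.shrink))
        else:                                     # stable: grow slowly
            self.size = min(self.max_size, self.size + len(batch))
        self.window.extend(batch)
        while len(self.window) > self.size:       # drop the oldest examples
            self.window.popleft()
        return list(self.window)
```

A classifier would then be retrained on `update`'s returned examples at each time step; note that this discards old data outright, which is exactly why such windowing does not qualify as incremental learning.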
Multiple-classifier systems (MCS), or ensembles, have
been suggested as an attractive method of learning concept
drift in [13], based on their natural ability to obtain a good
balance between stability (ability to retain relevant informa-
tion) and plasticity (ability to acquire new knowledge) [14].
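The basic combination step shared by such ensembles can be sketched as a weighted majority vote; this is a generic scheme, not DWM or Learn++.NSE specifically, and the members and weights below are illustrative:

```python
def weighted_majority_vote(classifiers, weights, x):
    """Sum each member's weight into the bin of the class it predicts
    for x, then return the class with the largest total weight."""
    votes = {}
    for clf, w in zip(classifiers, weights):
        label = clf(x)
        votes[label] = votes.get(label, 0.0) + w
    return max(votes, key=votes.get)

# Two lightly weighted members predict class 0; one heavily weighted
# (e.g., recently accurate) member predicts class 1 and carries the vote.
members = [lambda x: 0, lambda x: 0, lambda x: 1]
weights = [0.2, 0.2, 0.9]
```

The choice of how the weights are computed, by error in Learn++.NSE, or by an imbalance-aware measure in the frameworks proposed here, is what distinguishes the algorithms built on this vote.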
Kolter & Maloof present the dynamic weighted majority (DWM) algorithm in [1], which uses an online learner such
An Ensemble Based Incremental Learning Framework for Concept Drift and Class Imbalance
Gregory Ditzler, Member, IEEE and Robi Polikar, Senior Member, IEEE
978-1-4244-8126-2/10/$26.00 ©2010 IEEE