Abstract—We have recently introduced Learn++.NSE, an incremental learning algorithm designed to learn in nonstationary environments, which has been shown to provide an attractive solution to a number of concept drift problems under different drift scenarios. However, Learn++.NSE relies on each classifier's error on the most recent data to weight the classifiers in the ensemble. For balanced class distributions this approach works very well, but when faced with imbalanced data, error is no longer an acceptable measure of performance. On the other hand, the well-established SMOTE algorithm can address the class imbalance issue; however, it cannot learn in nonstationary environments. While there is some literature on learning in nonstationary environments and on learning from imbalanced data separately, the combined problem of learning from imbalanced data drawn from a nonstationary environment remains underexplored. Therefore, in this work we propose two modified frameworks for an algorithm that can incrementally learn from imbalanced data drawn from a nonstationary environment.

Index Terms—concept drift, imbalanced data, ensemble of classifiers, incremental learning in nonstationary environments

I. INTRODUCTION

Concept drift, associated with learning in nonstationary environments, receives substantially less attention in the classical machine learning literature, particularly when such an environment also generates imbalanced class distributions. Concept drift can be defined as a change in the underlying distribution that generates the data used to train a classifier; as a result, classifiers trained on previously available data may become obsolete. While learning in nonstationary environments and learning from class imbalance have been researched independently, and several novel algorithms have been proposed to handle nonstationary concepts or imbalanced data, relatively little work addresses the combination of these problems [1-3].
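As a concrete illustration of why raw classification error is a poor yardstick under class imbalance, consider the following minimal sketch. The 95/5 class split and the degenerate majority-class predictor are hypothetical, chosen only to make the point:

```python
# A trivial "always predict the majority class" rule on a 95/5 imbalanced
# label set: its error looks excellent, yet it never finds the minority class.
labels = [0] * 95 + [1] * 5       # 0 = majority class, 1 = minority class
predictions = [0] * 100           # degenerate classifier: always predicts 0

error = sum(p != y for p, y in zip(predictions, labels)) / len(labels)
minority_recall = sum(p == y == 1 for p, y in zip(predictions, labels)) / 5

print(error)            # 0.05 -> appears to be a strong classifier
print(minority_recall)  # 0.0  -> every minority example is missed
```

An error-weighted ensemble would assign this classifier a high voting weight despite it being useless on the minority class, which is precisely the failure mode that motivates replacing error with an imbalance-aware measure.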
Learning in a nonstationary environment requires that the learner be able to learn from a concept that changes over time. This change can be real or virtual: real drift is a change in the class likelihoods, while virtual drift is the result of an incomplete representation of the true distribution in the current data. Real and virtual drift may occur at the same time, and it can be difficult to determine which one is occurring, and more difficult still to tell whether both are occurring simultaneously [4].

The main contribution of this work is a supervised, ensemble-of-classifiers based incremental learning algorithm designed to work in nonstationary environments that experience class imbalance in the data. This framework is based on the Learn++.NSE algorithm; however, the error of each classifier is no longer the contributing factor to the weighting scheme. Following a review of approaches for imbalanced data, nonstationary learning, and a combination of the two in Section II, we describe the algorithm in Section III, followed by results on several databases subject to concept drift and class imbalance in Section IV. Finally, Section V contains conclusions and final remarks.

Manuscript received January 31, 2010; revised and resubmitted May 2, 2010. This work was supported by the National Science Foundation under Grant No. ECCS-0926159. The authors are with the ECE Department and the Signal Processing & Pattern Recognition Lab, Rowan University, Glassboro, NJ 08028, USA (e-mail: gditzler@ieee.org, polikar@rowan.edu).

II. BACKGROUND

A. Nonstationary Environments

One of the earliest approaches for classifying data in a nonstationary environment uses a sliding window, whose size is determined by the rate of drift. An algorithm that uses an adjustable window therefore typically follows an active approach to drift detection, constantly seeking to detect change, as presented in [5-9].
Typically, in such algorithms, a drift detection mechanism updates the current model only when drift is detected, assuming that the old model (and hence the old data) is no longer relevant. The faster the drift rate, the shorter the window length, with the understanding that older data become increasingly less relevant as the environment changes. Conversely, the window grows if the drift is slow or nonexistent, with the understanding that data from several time steps ago may still be relevant and useful for classification. The FLORA family of algorithms was one of the first methods to employ this dynamic window length approach [8]. While the approach is very simple, it does not allow for incremental learning, since incremental learning requires extracting knowledge from the current data and the existing model(s) without requiring access to previous data. A passive approach to learning concept drift, on the other hand, simply accepts that concept drift may or may not have occurred, and updates the model with each incoming batch of the data stream. The algorithms proposed in [1, 3, 10-12] are all passive algorithms.

Multiple-classifier systems (MCS), or ensembles, have been suggested as an attractive method of learning concept drift in [13], based on their natural ability to obtain a good balance between stability (the ability to retain relevant information) and plasticity (the ability to acquire new knowledge) [14]. Kolter & Maloof present the dynamic weighted majority (DWM) algorithm in [1], which uses an online learner such

An Ensemble Based Incremental Learning Framework for Concept Drift and Class Imbalance
Gregory Ditzler, Member, IEEE and Robi Polikar, Senior Member, IEEE
978-1-4244-8126-2/10/$26.00 ©2010 IEEE
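The window-adjustment heuristic used by active, window-based methods (shrink the window under fast drift, grow it when the environment is stable) can be sketched as follows. The thresholds, bounds, and the use of recent error as the drift signal are illustrative assumptions, not the actual rules of FLORA or any specific algorithm:

```python
def adjust_window(window_size, recent_error, grow_threshold=0.1,
                  shrink_threshold=0.3, min_size=10, max_size=500):
    """Active drift handling with an adjustable window: shrink the window
    when recent error jumps (fast drift, so older data are stale), and grow
    it when error stays low (slow or no drift, so older data remain useful)."""
    if recent_error > shrink_threshold:        # suspected drift: halve the window
        return max(min_size, window_size // 2)
    if recent_error < grow_threshold:          # stable environment: grow slowly
        return min(max_size, window_size + 10)
    return window_size                         # inconclusive: keep current size

print(adjust_window(100, 0.50))   # error spike  -> window shrinks to 50
print(adjust_window(100, 0.05))   # stable error -> window grows to 110
```

Note that the classifier must still be retrained from scratch on the retained window each time, which is why this family of methods is not incremental in the sense used in this paper.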