F. Roli, J. Kittler, and T. Windeatt (Eds.): MCS 2004, LNCS 3077, pp. 52–61, 2004.
© Springer-Verlag Berlin Heidelberg 2004

Learn++.MT: A New Approach to Incremental Learning

Michael Muhlbaier, Apostolos Topalis, and Robi Polikar

Rowan University, Electrical and Computer Engineering Department
201 Mullica Hill Rd., Glassboro, NJ 08028, USA
{muhl1565,topa4536}@students.rowan.edu, polikar@rowan.edu

Abstract. An ensemble-of-classifiers based algorithm, Learn++, was recently introduced that is capable of incrementally learning new information from datasets that become available consecutively, even if the new data introduce additional classes that were not formerly seen. The algorithm does not require access to previously used datasets, yet it largely retains the previously acquired knowledge. However, Learn++ suffers from an inherent "out-voting" problem when asked to learn new classes, which causes it to generate an unnecessarily large number of classifiers. This paper proposes a modified version of the algorithm, called Learn++.MT, that not only reduces the number of classifiers generated but also improves classification performance. The out-voting problem, the new algorithm, and its promising results on two benchmark datasets as well as one real-world application are presented.

1 Introduction

It is well known that the amount of training data available and how well the data represent the underlying distribution are of paramount importance for an automated classifier's satisfactory performance. For many applications of practical interest, obtaining such adequate and representative data is often expensive, tedious, and time consuming. Consequently, it is not uncommon for the entire data to be obtained in installments, over a period of time.
Such scenarios require a classifier to be trained and incrementally updated – as new data become available – where the classifier must learn the novel information provided by the new data without forgetting the knowledge previously acquired from the data seen earlier. This raises the so-called stability-plasticity dilemma [1]: a completely stable classifier can retain knowledge but cannot learn new information, whereas a completely plastic classifier can instantly learn new information but cannot retain previous knowledge. Many popular classifiers, such as the ubiquitous multilayer perceptron (MLP) or radial basis function networks, are not structurally suitable for incremental learning, since they are "completely stable" classifiers.

The approach generally followed for learning from new data involves discarding the existing classifier, combining the old and the new data, and training a new classifier from scratch on the aggregate data. This causes the previously learned information to be lost, a phenomenon known as catastrophic forgetting [2]. Furthermore, training with the combined data may not even be feasible if the previously used data are lost, corrupted, prohibitively large, or otherwise unavailable.
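The forgetting behavior described above can be illustrated with a minimal sketch (not part of the paper's method): a simple perceptron is trained on an initial dataset D1, then updated using only a new dataset D2 containing examples of a single class, and its accuracy on D1 collapses. All data points and parameter values here are hypothetical, chosen only to make the effect visible.

```python
# Minimal sketch of catastrophic forgetting with a single perceptron.
# D1 and D2 are toy 2-D datasets; (x1, x2, y) with binary label y.

def train(w, b, data, lr=0.1, epochs=10):
    """Standard perceptron updates: w += lr * (y - pred) * x."""
    for _ in range(epochs):
        for x1, x2, y in data:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            w = (w[0] + lr * (y - pred) * x1, w[1] + lr * (y - pred) * x2)
            b += lr * (y - pred)
    return w, b

def accuracy(w, b, data):
    correct = sum(
        (1 if w[0] * x1 + w[1] * x2 + b > 0 else 0) == y
        for x1, x2, y in data
    )
    return correct / len(data)

D1 = [(-1, 0, 0), (1, 0, 1), (-2, 0.5, 0), (2, -0.5, 1)]   # initial data
D2 = [(-1.5, 3, 1), (-0.5, 2, 1)]                          # new data only

w, b = train((0.0, 0.0), 0.0, D1)
acc_before = accuracy(w, b, D1)        # classifier fits D1 perfectly

w, b = train(w, b, D2)                 # update on D2 alone, no access to D1
acc_after = accuracy(w, b, D1)         # performance on D1 degrades

print(acc_before, acc_after)           # 1.0 0.25
```

Training on D2 alone drags the decision boundary toward the new examples, so most of the previously learned D1 points are misclassified afterward. Avoiding exactly this trade-off, without retaining the old data, is the goal of the incremental learning algorithms discussed in this paper.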