Multicategory Incremental Proximal Support Vector Classifiers

Amund Tveit and Magnus Lie Hetland

Department of Computer and Information Science,
Norwegian University of Science and Technology, N-7491 Trondheim, Norway
{amundt,mlh}@idi.ntnu.no

Abstract. Support Vector Machines (SVMs) are an efficient data mining approach for classification, clustering and time series analysis. In recent years, a tremendous growth in the amount of data gathered has changed the focus of SVM classifier algorithms from providing accurate results to enabling incremental (and decremental) learning with new data (or unlearning old data) without the need for computationally costly retraining with the old data. In this paper we propose an efficient algorithm for multicategory classification with the incremental proximal SVM introduced by Fung and Mangasarian.

1 Introduction

Support Vector Machines (SVMs) are an efficient data mining approach for classification, clustering and time series analysis [1-3]. In recent years, a tremendous growth in the amount of data gathered (for example, in e-commerce and intrusion detection systems) has changed the focus of SVM classifier algorithms from providing accurate results to enabling incremental (and decremental) learning with new data (or unlearning old data) without the need for computationally costly retraining with the old data. Fung and Mangasarian [4] introduced the Incremental and Decremental Linear Proximal Support Vector Machine (PSVM) for binary classification and showed that it could be trained extremely efficiently, with one billion examples (500 increments of two million examples) in two hours and twenty-six minutes on relatively low-end hardware (a 400 MHz Pentium II). In this paper we propose an efficient algorithm based on memoization, in order to support multicategory classification for the incremental PSVM.
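The incremental linear PSVM mentioned above owes its efficiency to the fact that training reduces to accumulating two fixed-size sufficient statistics (the matrix E^T E and the vector E^T D e, with E = [A, -e]) over data increments and then solving a single small linear system. The following is a minimal NumPy sketch under that assumption; the class and method names (`IncrementalPSVM`, `partial_fit`) are ours for illustration, not from the paper.

```python
import numpy as np

class IncrementalPSVM:
    """Sketch of a binary incremental linear Proximal SVM.

    Accumulates E^T E and E^T D e across increments, where
    E = [A, -e], D = diag(labels), e = vector of ones.
    """

    def __init__(self, n_features, nu=1.0):
        self.nu = nu  # error-weight parameter (nu in the PSVM formulation)
        self.ete = np.zeros((n_features + 1, n_features + 1))  # accumulated E^T E
        self.etde = np.zeros(n_features + 1)                   # accumulated E^T D e

    def partial_fit(self, A, d):
        """Absorb one increment: A is (m, n) data, d is (m,) labels in {-1, +1}."""
        E = np.hstack([A, -np.ones((A.shape[0], 1))])  # E = [A, -e]
        self.ete += E.T @ E
        self.etde += E.T @ d  # D e = d, since D = diag(d) and e = ones
        return self

    def solve(self):
        """Recover (w, gamma) from the accumulated statistics."""
        n1 = self.ete.shape[0]
        wg = np.linalg.solve(np.eye(n1) / self.nu + self.ete, self.etde)
        self.w, self.gamma = wg[:-1], wg[-1]
        return self

    def predict(self, A):
        """Classify rows of A by the sign of A w - gamma."""
        return np.sign(A @ self.w - self.gamma)
```

Because each increment only updates matrices of size (n+1) x (n+1), old data can be discarded after it is absorbed, which is what makes incremental (and, by subtraction, decremental) learning cheap.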
2 Background Theory

The standard binary SVM classification problem with a soft margin (allowing some errors) is shown visually in Fig. 1(a) and as a constrained quadratic programming problem in (1). Intuitively, the problem is to maximize the margin between the solid planes while permitting as few errors as possible, an error being a positive-class point on the negative side (of the solid line) or vice versa.
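For reference, the soft-margin linear SVM described here is commonly written as follows, with A the m x n matrix of training points, D the diagonal matrix of +/-1 labels, e a vector of ones, y the slack (error) variables, and nu > 0 the error weight; this is the standard formulation, and the exact symbols of the paper's equation (1) may differ:

```latex
\min_{(w,\,\gamma,\,y)} \;\; \nu\, e^{\top} y + \tfrac{1}{2}\, w^{\top} w
\qquad \text{s.t.} \qquad
D(Aw - e\gamma) + y \ge e, \qquad y \ge 0 .
```

The objective trades off the margin width (through $w^{\top}w$) against the total classification error $e^{\top}y$, with $\nu$ controlling the balance.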