CLASSIFICATION BY LINEARITY ASSUMPTION

A. Majumdar & A. Bhattacharya
Department of Electrical and Computer Engineering, University of British Columbia
PricewaterhouseCoopers India Pvt Ltd

ABSTRACT

Recently a classifier was proposed based on the assumption that the training samples of a particular class form a linear basis for any new test sample. This assumption is a generalisation of the Nearest Neighbour (NN) classifier. In previous work, the classifier built upon this assumption required solving a complex optimisation problem. The optimisation method was time consuming and restrictive in application. In this work our proposed algorithm overcomes these problems while keeping the basic assumption intact. We also offer generalisations of the basic assumption. Comparative experimental results on some UCI machine learning databases show that our proposed generalised classifier performs as well as other well-known techniques such as Nearest Neighbour and Support Vector Machine.

Index Terms— classification

1. INTRODUCTION

In this paper we propose a classification algorithm that builds on the basic idea of NN. The main difference between NN and our approach is that the final outcome does not depend on errors from individual training samples (as in NN) but on the combined error from all the training samples of a particular class. Our assumption is: the training samples of a particular class form a linear basis for any new test sample. In a recent work [1] this assumption led to a classification problem that required solving a sparse optimisation problem via l1 minimisation. This optimisation forms the main workhorse of their classification approach, and hence the problems associated with it creep into their algorithm. We will discuss these problems in greater detail in a later section and propose classification methods that overcome them. The rest of the paper is organised as follows.
The following section presents the proposed idea and some related work in this area. Section 3 describes the implementation of our different classification algorithms. Section 4 reports experimental results. Finally, section 5 discusses the conclusions of this work.

2. LINEARITY ASSUMPTION

First we define the problem formally. Suppose there are C classes, and each class i contains n_i training samples. Further, each sample is represented by a d-dimensional vector v. The samples are indexed as v_{i,j}, where 'i' denotes the class of the sample and 'j' is the sample number within that class. In this setting, NN classification (with the Euclidean distance) is formally written as

\text{Class}(v_{test}) = \arg\min_{i,j} \| v_{test} - v_{i,j} \|_2   (1)

The basic assumption of NN is

v_{test} = v_{i,j} + \varepsilon   (2)

and v_{test} is assigned to the class having the minimum error. NN assumes that the test sample can be represented by a single training sample and that the distance between them is the least. So the NN algorithm calculates the distance between the test sample and each training sample, and assigns the test sample to the class for which this distance is smallest.

We generalise assumption (2): the test sample can be represented as a linear combination of the training samples of the class to which it belongs. Writing this formally, assuming that the test sample belongs to the k-th class,

v_{test} = \alpha_1 v_{k,1} + \alpha_2 v_{k,2} + \dots + \alpha_{n_k} v_{k,n_k} + \varepsilon   (3)

This generalises the NN assumption in two ways: a) it relies on the entire class rather than a single sample, and b) it accounts for the fact that the test sample may be a scaled combination of the training samples. The 'linearity assumption' in the title of the paper refers to assumption (3).

In a recent work a classification scheme was built upon this assumption [1]. In [1] a matrix V of dimension d×N, where N = ∑ n_i, is created:

V = [ v_{1,1} \; v_{1,2} \; \dots \; v_{1,n_1} \; \dots \; v_{C,1} \; v_{C,2} \; \dots \; v_{C,n_C} ]

So assumption (3) can be written in terms of V as

v_{test} = V\alpha   (4)

where α is a sparse vector with nonzero entries only at positions corresponding to the columns of the k-th class in V. In almost all cases (4) is an under-determined system and requires some constraint; in this case the constraint is that the solution must be sparse.
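The NN rule of Eqs. (1)–(2) can be sketched in a few lines of NumPy; the function and variable names here are illustrative, not from the original paper:

```python
import numpy as np

def nn_classify(test, samples, labels):
    """Nearest Neighbour classification per Eq. (1): compute the Euclidean
    distance from the test vector to every training sample and return the
    label of the closest one.

    test:    (d,) test vector v_test
    samples: (N, d) stacked training samples v_{i,j}
    labels:  (N,) class label of each training sample
    """
    dists = np.linalg.norm(samples - test, axis=1)  # ||v_test - v_{i,j}||_2
    return labels[np.argmin(dists)]

# Toy usage: two samples per class in 2-D.
samples = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
labels = np.array([0, 0, 1, 1])
print(nn_classify(np.array([5.0, 4.0]), samples, labels))  # closest sample is [5, 5]
```

Note that the decision depends only on the single nearest training sample, which is exactly the limitation that assumption (3) is meant to remove.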
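One simple way to instantiate assumption (3) — fitting the coefficients of each class separately by least squares and picking the class with the smallest residual — is sketched below. This is an illustrative sketch only: it is neither the l1-minimisation scheme of [1] nor necessarily the algorithm proposed later in this paper, and all names in it are assumptions:

```python
import numpy as np

def linearity_classify(test, class_samples):
    """Classify per assumption (3): for each class k, fit
    test ≈ V_k @ alpha in the least-squares sense and assign the class
    whose residual ||test - V_k @ alpha||_2 is smallest.

    test:          (d,) test vector v_test
    class_samples: list of (n_k, d) arrays, the training samples of class k
    """
    residuals = []
    for Vk in class_samples:
        A = np.asarray(Vk, dtype=float).T               # d x n_k basis matrix
        alpha, *_ = np.linalg.lstsq(A, test, rcond=None)
        residuals.append(np.linalg.norm(test - A @ alpha))
    return int(np.argmin(residuals))

# Toy usage: class 0 spans the first axis, class 1 spans the other two.
class0 = np.array([[1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
class1 = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
print(linearity_classify(np.array([0.0, 2.0, 3.0]), [class0, class1]))
```

The class-wise residual combines the contributions of all training samples of a class, in contrast to NN, which commits to a single sample; the sparse formulation (4) instead solves for one joint α over all classes at once.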