A Geometric Viewpoint of the Selection of the Regularization Parameter in Some Support Vector Machines

Nandyala Hemachandra and Puja Sahu
Indian Institute of Technology Bombay, Mumbai, India
{nh,puja.sahu}@iitb.ac.in

Abstract. The regularization parameter of support vector machines is intended to improve their generalization performance. Since the feasible region of a binary class support vector machine with a finite dimensional feature space is a polytope, we note that classifiers at the vertices of this unbounded polytope correspond to certain ranges of the regularization parameter. This reduces the search for a suitable regularization parameter to a search over the (finitely many) vertices of this polytope. We propose an algorithm that identifies the neighbouring vertices of a given vertex and thereby identifies the classifiers corresponding to the set of vertices of this polytope. A classifier can then be chosen from among them based on a suitable test error criterion. We illustrate our results with an example which demonstrates that this path can be complicated: a portion of the path is sandwiched between two finite intervals of the path, each generated by a separate set of vertices and edges.

Keywords: Support vector machines, regularization path, polytopes, neighbouring vertices, prediction error, parameter tuning, linear programming.

1 Introduction

A classical learning problem is that of binary classification, wherein the learner is trained on a given data set (the training set) and predicts the class of a new data point. Let the $n$-point training set be $\{(x_i, y_i)\}_{i=1}^{n}$, where $x_i \in \mathbb{R}^m$ is a vector of $m$ features and $y_i \in \{-1, +1\}$ is the label of $x_i$, $i \in \{1, \dots, n\}$. We consider the class of linear classifiers, $(w, b)$, with $w \in \mathbb{R}^m$ and $b \in \mathbb{R}$. The classifier predicts the class of a data point $x$ as $-1$ if $w \cdot x + b < 0$ and as $+1$ otherwise, i.e., the predicted class for $x$ is $\mathrm{sign}(w \cdot x + b)$. Such classifiers are called linear Support Vector Machines (SVMs).
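The decision rule above can be sketched in a few lines of code; the classifier values below are made up purely for illustration, not taken from the paper's example.

```python
import numpy as np

def predict(w, b, x):
    """Predict the class of x with the linear classifier (w, b):
    +1 if w . x + b >= 0, and -1 otherwise, i.e. sign(w . x + b)."""
    return 1 if np.dot(w, x) + b >= 0 else -1

# Illustrative classifier with m = 2 features (values are assumptions).
w = np.array([1.0, -2.0])
b = 0.5

print(predict(w, b, np.array([3.0, 1.0])))  # w.x + b = 1.5  >= 0, so +1
print(predict(w, b, np.array([0.0, 1.0])))  # w.x + b = -1.5 <  0, so -1
```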
Among finite dimensional models for binary class prediction, polynomial kernels form an important class. They are quite popular in natural language processing (NLP) because fast linear SVM methods can be applied to the polynomially mapped data and can achieve accuracy close to that of highly nonlinear kernels [2].
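The "polynomially mapped data" referred to here can be produced by an explicit feature map; the following is a minimal sketch (the function name and test vector are illustrative assumptions) of a degree-2 map, after which a linear classifier $(w, b)$ in the mapped space acts as a quadratic classifier in the original space.

```python
import numpy as np
from itertools import combinations_with_replacement

def poly2_map(x):
    """Explicit degree-2 polynomial feature map: the constant term,
    all linear terms, and all degree-2 monomials of the entries of x."""
    feats = [1.0]                 # constant term
    feats.extend(x)               # linear terms x_1, ..., x_m
    feats.extend(x[i] * x[j]      # quadratic terms x_i * x_j, i <= j
                 for i, j in combinations_with_replacement(range(len(x)), 2))
    return np.array(feats)

x = np.array([2.0, 3.0])
z = poly2_map(x)
print(z)  # [1. 2. 3. 4. 6. 9.] : constant, x1, x2, x1^2, x1*x2, x2^2
```

A linear SVM trained on the mapped vectors `z` then separates the original data with a polynomial decision boundary, which is the idea behind applying fast linear methods to polynomially mapped data.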