A Better Correlation of the SVM Kernel's Parameters

Daniel I. Morariu, Lucian N. Vintan

Abstract — Support Vector Machine (SVM) is a powerful kernel-based classification method that works well on large data sets. In almost all cases two distinct parameters can be modified to obtain the best results. One of these parameters is easy to infer, but the second is usually set to the number of features taken into consideration. For text classification this is not a good choice, because it has been shown that classification accuracy is quite poor when the number of features is large. In this paper we propose a method that correlates these parameters in order to obtain better results while varying only one of them. We also show that using this method leads to better results in almost all cases. We introduce two formulas that correlate the parameters, one for the polynomial kernel and one for the Gaussian kernel.

Keywords — Learning with Kernels, Support Vector Machine, Text Classification

I. INTRODUCTION

Documents are typically represented as vectors in the feature space. Each word in the vocabulary represents one dimension of this space, and the number of occurrences of a word in a document gives the value of the corresponding component of the document's vector. The native feature space consists of the unique terms that occur in the documents, which can amount to tens or hundreds of thousands of terms even for a moderate-sized text collection; this high dimensionality is a major problem in text categorization. As the method for text classification we use the Support Vector Machine technique, which has proved efficient for nonlinearly separable input data. It is a relatively recent learning method based on kernels [1],[2]. We use this method both in the feature selection step and in the classification step. We also studied the influence of the input data representation on the correlation of the kernel parameters. In this paper we present a comparative study of different parameters for two types of kernels, polynomial and Gaussian, and of different methods to correlate the kernels' parameters.
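As a concrete illustration of the vector representation described above, the sketch below builds term-frequency vectors for a small document set. This is a minimal Python sketch under our own naming; the helper functions are not from the paper.

```python
from collections import Counter

def build_vocabulary(documents):
    """Collect the unique terms of all documents; each term becomes one dimension."""
    terms = sorted({term for doc in documents for term in doc.split()})
    return {term: index for index, term in enumerate(terms)}

def term_frequency_vector(document, vocabulary):
    """Represent a document as its vector of term occurrence counts."""
    counts = Counter(document.split())
    return [counts.get(term, 0) for term in sorted(vocabulary, key=vocabulary.get)]

docs = ["the cat sat", "the dog sat on the mat"]
vocab = build_vocabulary(docs)
print(term_frequency_vector(docs[1], vocab))  # → [0, 1, 1, 1, 1, 2]
```

Even this toy vocabulary has six dimensions; for a real collection the same construction produces the tens or hundreds of thousands of dimensions that motivate the feature selection step.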
In almost all articles where the SVM method is used, researchers use two parameters to define the kernel but usually vary only one of them. The modification rule for the second parameter is not explicitly specified; sometimes it is simply set to the number of features. For some kinds of applications, where the number of features is not very large, this can be a good idea; but in text classification, where the number of features can be very large, it can lead to poor results.

D. Morariu is with the Faculty of Engineering, "Lucian Blaga" University of Sibiu, Romania (phone: 40/740/092202; e-mail: daniel.morariu@ulbsibiu.ro). L. Vintan is with the Faculty of Engineering, "Lucian Blaga" University of Sibiu, Romania (phone: 40/745/927450; e-mail: lucian.vintan@ulbsibiu.ro).

The general process of classifying text data can be considered as having four steps. In the first step, features are extracted from the text files (text mining). In the second step, the features are selected. The third step is the learning (training) step. The last is the evaluation step, where the classification process is evaluated. In the first step we used text mining as an application of data mining techniques to extract the feature vectors that characterize a document. Starting with a set of d documents and t terms (words belonging to the documents), we model each document as a vector v in the t-dimensional feature space (a feature vector that characterizes the document within the set of documents). In the second step we used the SVM technique as a feature selection method in order to reduce the dimension of the feature space and to select the best features; in [3] SVM-based feature selection was shown to be the best one. For the input data we used three types of representations: binary, nominal, and Cornell SMART. In the next step (classification) we also used the Support Vector Machine.
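The three input representations mentioned above can be sketched with their common definitions from the text-classification literature. The exact formulas used in the paper are not given in this excerpt, so the Cornell SMART variant below (1 + log(1 + log(TF)) for nonzero term frequency) is an assumption, as are the function names.

```python
import math

def binary_weight(tf):
    # binary representation: 1 if the term occurs in the document, 0 otherwise
    return 1 if tf > 0 else 0

def nominal_weight(tf, max_tf):
    # nominal representation: frequency normalised by the document's most frequent term
    return tf / max_tf if max_tf > 0 else 0.0

def cornell_smart_weight(tf):
    # Cornell SMART-style damped logarithm; absent terms get weight 0
    return 0.0 if tf == 0 else 1.0 + math.log(1.0 + math.log(tf))
```

With these weightings the same occurrence counts yield three different input vectors, which is what a comparison of input representations for the SVM contrasts.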
A great advantage of this technique is that it can handle large input sets. We implemented this classification method for two types of kernels: the polynomial kernel and the Gaussian kernel (Radial Basis Function, RBF). We tried to find a simplified form of the kernels that uses only one, more intuitive parameter; rather than reducing performance, this simplification actually increases it. For multi-class classification we chose the well-known "one class versus the rest" method [4]. Thus, considering M classes, we repeated the two-class classification for each topic (the category in which a document is classified), obtaining M decision functions.

Section II contains prerequisites for the work presented in this paper. Section III presents the framework and methodology used in our experiments. Section IV presents the main results of our experiments. Section V discusses the most important results and proposes some further work.

II. SUPPORT VECTOR MACHINE

Support Vector Machine (SVM) is a classification technique based on statistical learning theory [4],[5] that has been applied with great success to many challenging nonlinear classification problems and used on large data sets with many samples. The purpose of the algorithm is to find a hyperplane that optimally splits the training set (a practical introduction can be found in [6]). The technique is fundamentally a two-class classifier, and there are several methods for extending it to more than two classes. Looking at the two-dimensional problem, we actually want to find a line that