Some Properties of the Gaussian Kernel for One Class Learning

Paul F. Evangelista^1, Mark J. Embrechts^2, and Boleslaw K. Szymanski^2

^1 United States Military Academy, West Point, NY 10996
^2 Rensselaer Polytechnic Institute, Troy, NY 12180

Abstract. This paper proposes a novel approach for directly tuning the gaussian kernel matrix for one class learning. The popular gaussian kernel includes a free parameter, σ, that requires tuning typically performed through validation. The value of this parameter impacts model performance significantly. This paper explores an automated method for tuning this kernel based upon a hill climbing optimization of statistics obtained from the kernel matrix.

1 Introduction

Kernel based pattern recognition has gained much popularity in the machine learning and data mining communities, largely based upon proven performance and broad applicability. Clustering, anomaly detection, classification, regression, and kernel based principal component analysis are just a few of the techniques that use kernels for some type of pattern recognition. The kernel is a critical component of these algorithms - arguably the most important component.

The gaussian kernel is a popular and powerful kernel used in pattern recognition. Theoretical statistical properties of this kernel provide potential approaches for the tuning of this kernel and potential directions for future research. Several heuristics which have been employed with this kernel will be introduced and discussed.

Assume a given data set X ∈ R^{N×m}. X contains N instances or observations, x_1, x_2, ..., x_N, where x_i ∈ R^{1×m}. There are m variables to represent each instance i. For every instance there is a label or class, y_i ∈ {−1, +1}. Equation 1 illustrates the formula to calculate a gaussian kernel.

\kappa(i, j) = e^{-\|x_i - x_j\|^2 / 2\sigma^2}    (1)

This kernel requires tuning for the proper value of σ. Manual tuning and brute force search are two alternative approaches.
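Equation 1 can be computed for an entire data set as an N × N kernel matrix. A minimal sketch follows; the use of NumPy and the names `X`, `sigma`, and `gaussian_kernel_matrix` are illustrative assumptions, not part of the paper.

```python
import numpy as np

def gaussian_kernel_matrix(X, sigma):
    """Return K with K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)) per Equation 1."""
    # Pairwise squared Euclidean distances between all rows of X, via broadcasting.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

# Hypothetical example: N = 5 instances, m = 3 variables.
X = np.random.default_rng(0).normal(size=(5, 3))
K = gaussian_kernel_matrix(X, sigma=1.0)
# K is symmetric with ones on the diagonal, since ||x_i - x_i|| = 0.
```

Note that all entries of K lie in (0, 1], and shrinking σ drives the off-diagonal entries toward 0, which is why the choice of σ shapes the kernel matrix so strongly.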
A brute force technique could involve stepping through a range of values for σ, perhaps in a gradient ascent optimization, seeking optimal performance of a model on training data. Regardless of the method utilized to find a proper value for σ, this type of model validation is common and necessary when using the gaussian kernel. Although this approach is feasible with supervised learning, it is much more difficult to tune σ for unsupervised learning methods. The one-class SVM, originally proposed by Tax and