Fitting the smallest enclosing Bregman balls

Richard Nock 1 and Frank Nielsen 2

1 Université Antilles-Guyane, rnock@martinique.univ-ag.fr
2 Sony Computer Science Laboratories, Inc., Frank.Nielsen@acm.org

Abstract. Finding a point which minimizes the maximal distortion with respect to a dataset is an important estimation problem that has recently received growing attention in machine learning, with the advent of one-class classification. In this paper, we study the problem from a general standpoint, and suppose that the distortion is a Bregman divergence, without restriction. Applications of this formulation can be found in machine learning, statistics, signal processing and computational geometry. We propose two theoretically founded generalizations of a popular smallest enclosing ball approximation algorithm for Euclidean spaces coined by Bădoiu and Clarkson in 2002. Experiments clearly display the advantages of being able to tune the divergence depending on the data's domain. As an additional result, we unveil a useful bijection between Bregman divergences and a family of popular averages that includes the arithmetic, geometric, harmonic and power means.

1 Introduction

Consider the following problem: given a set of observed data S, compute some accurate set of parameters, or simplified descriptions, that summarize ("fit well") S according to some criteria. This problem is well known in various fields of statistics and computer science. In many cases, it admits two different formulations:

(1.) Find a point c which minimizes an average distortion with respect to S.
(2.) Find a point c which minimizes a maximal distortion with respect to S.
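The contrast between the two formulations can be made concrete for the squared Euclidean distortion: the average-distortion minimizer of (1.) is simply the centroid of S, while the maximal-distortion minimizer of (2.) is the center of the smallest enclosing ball, which a Bădoiu–Clarkson-style iteration approximates by repeatedly stepping toward the current farthest point. The following is a minimal sketch under those assumptions; the function names are illustrative, not from the paper:

```python
import numpy as np

def average_center(points):
    # Formulation (1.) under squared Euclidean distortion:
    # the minimizer of the average distortion is the centroid.
    return points.mean(axis=0)

def minimax_center(points, iterations=1000):
    # Formulation (2.) under squared Euclidean distortion, via a
    # Badoiu-Clarkson-style iteration (sketch): start at any data
    # point, then step toward the farthest point with shrinking
    # step size 1/(t+1), approximating the smallest enclosing
    # ball's center.
    c = points[0].astype(float).copy()
    for t in range(1, iterations + 1):
        distortions = ((points - c) ** 2).sum(axis=1)
        farthest = points[np.argmax(distortions)]
        c += (farthest - c) / (t + 1)
    return c
```

For the three points (0,0), (2,0), (0,2), the two criteria disagree: the centroid is (2/3, 2/3), whereas the smallest enclosing ball is centered at (1, 1), illustrating why the min-max formulation requires its own algorithmic treatment.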
These two problems are cornerstones of different subfields of applied mathematics and computer science, such as (i) parametric estimation and the computation of exhaustive statistics for broad classes of distributions in statistics, (ii) one-class classification and clustering in machine learning, and (iii) the one-center problem and its generalizations in computational geometry, among others [1, 2, 5, 9].

The main unknown in both problems is what we mean by distortion. Intuitively, for any two elements of S, it should be lower-bounded, attain its minimum when they represent the same element, and otherwise give an accurate real-valued appreciation of the way they actually "differ". Perhaps the most prominent example is the squared Euclidean distance (abbreviated L_2^2) for real-valued vectors, which is the componentwise sum of the squared differences. It is certainly the most commonly used distortion measure in computational geometry, and one of the most favored in machine learning (support