Fitting the smallest enclosing Bregman balls

Richard Nock 1 and Frank Nielsen 2

1 Université Antilles-Guyane, rnock@martinique.univ-ag.fr
2 Sony Computer Science Laboratories, Inc., Frank.Nielsen@acm.org

Abstract. Finding a point which minimizes the maximal distortion with respect to a dataset is an important estimation problem that has recently received growing attention in machine learning, with the advent of one-class classification. In this paper, we study the problem from a general standpoint, and suppose that the distortion is a Bregman divergence, without restriction. Applications of this formulation can be found in machine learning, statistics, signal processing and computational geometry. We propose two theoretically founded generalizations of a popular smallest enclosing ball approximation algorithm for Euclidean spaces coined by Bădoiu and Clarkson in 2002. Experiments clearly display the advantages of being able to tune the divergence depending on the data's domain. As an additional result, we unveil a useful bijection between Bregman divergences and a family of popular averages that includes the arithmetic, geometric, harmonic and power means.

1 Introduction

Consider the following problem: given a set of observed data S, compute some accurate set of parameters, or simplified descriptions, that summarize ("fit well") S according to some criteria. This problem is well known in various fields of statistics and computer science. In many cases, it admits two different formulations:

(1.) Find a point c which minimizes an average distortion with respect to S.
(2.) Find a point c which minimizes a maximal distortion with respect to S.
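The contrast between the two formulations can be made concrete for the squared Euclidean distortion: the average-distortion minimizer of (1.) is simply the centroid of S, while the maximal-distortion minimizer of (2.) is the center of the smallest enclosing ball, which a Bădoiu–Clarkson-style iteration approximates by repeatedly stepping toward the current farthest point. The following is a minimal sketch under those assumptions; the function names are illustrative, not from the paper:

```python
import numpy as np

def average_center(points):
    # Formulation (1.) under squared Euclidean distortion:
    # the minimizer of the average distortion is the centroid.
    return points.mean(axis=0)

def minimax_center(points, iterations=1000):
    # Formulation (2.) under squared Euclidean distortion, via a
    # Badoiu-Clarkson-style iteration (sketch): start at any data
    # point, then step toward the farthest point with shrinking
    # step size 1/(t+1), approximating the smallest enclosing
    # ball's center.
    c = points[0].astype(float).copy()
    for t in range(1, iterations + 1):
        distortions = ((points - c) ** 2).sum(axis=1)
        farthest = points[np.argmax(distortions)]
        c += (farthest - c) / (t + 1)
    return c
```

For the three points (0,0), (2,0), (0,2), the two criteria disagree: the centroid is (2/3, 2/3), whereas the smallest enclosing ball is centered at (1, 1), illustrating why the min-max formulation requires its own algorithmic treatment.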
These two problems are cornerstones of different subfields of applied mathematics and computer science, such as (i) parametric estimation and the computation of exhaustive statistics for broad classes of distributions in statistics, (ii) one-class classification and clustering in machine learning, and (iii) the one-center problem and its generalizations in computational geometry, among others [1, 2, 5, 9].

The main unknown in both problems is what we mean by distortion. Intuitively, for any two elements of S, it should be lower-bounded, attain its minimum when they represent the same element, and otherwise give an accurate real-valued appreciation of the way they actually "differ". Perhaps the most prominent example is the squared Euclidean distance (abbreviated L_2^2) for real-valued vectors, which is the componentwise sum of the squared differences. It is certainly the most commonly used distortion measure in computational geometry, and one of the most favored in machine learning (support