Metrika (1997) 46:41-57
0026-1335/97/46:1/41-57 $2.50 © 1997 Physica-Verlag, Heidelberg

The Multiresolution Histogram

JOACHIM ENGEL
PH Ludwigsburg, Reuteallee 46, Postfach 220, 71602 Ludwigsburg, Germany

Abstract: We introduce a new method for locally adaptive histogram construction that does not resort to a standard distribution and is easy to implement: the multiresolution histogram. It is based on an $L_2$ analysis of the mean integrated squared error with Haar wavelets and hence can be associated with a multiresolution analysis of the sample space.

Key Words and Phrases: histogram; bin size selection; multiresolution analysis; wavelets.

1 Introduction

The histogram is the oldest and most widely used nonparametric density estimator. It is intuitively very plausible and easy to compute. The histogram requires a partition of the sample space into sets $B_k$, $k = 1, \dots, m$, and is defined as

$$\hat{f}(x) = \frac{1}{n\,\lambda(B_k)} \, \#\{i \mid X_i \in B_k\} \tag{1}$$

for $x \in B_k$. Here $X_1, \dots, X_n$ denotes the data, assumed to be independent observations of a random variable $X$ with unknown density $f$, and $n$ is the sample size. We consider only the case of one-dimensional observations, i.e. $B_k \subset \mathbb{R}$, with $B_k$ some real interval whose Lebesgue measure is $\lambda(B_k)$.

The simplest case is an equal bin size histogram. Then the $B_k$ are determined through the choice of an origin $x_0$ and a bin size or cell width $h$ as $B_k = [x_0 + (k-1)h,\; x_0 + kh)$. Since $\lambda(B_k) = h$, formula (1) then takes the form

$$\hat{f}(x) = \frac{1}{nh} \, \#\{i \mid X_i \in [x_0 + (k-1)h,\; x_0 + kh)\}. \tag{2}$$

The shape of the histogram and its quality as an estimator of the density $f$ depend decisively on the choice of the bin size $h$. If $h$ is too large, then all