Computational Statistics and Data Analysis 72 (2014) 13–29
journal homepage: www.elsevier.com/locate/csda
Unimodal density estimation using Bernstein polynomials
Bradley C. Turnbull ∗, Sujit K. Ghosh
Department of Statistics, North Carolina State University, Raleigh, NC 27695, United States
article info
Article history:
Received 7 February 2013
Received in revised form 18 October 2013
Accepted 19 October 2013
Available online 29 October 2013
Keywords:
Bernstein polynomials
Density estimation
Mixture models
Unimodal
abstract
The estimation of probability density functions is one of the fundamental aspects of any
statistical inference. Many data analyses are based on an assumed family of parametric
models, which are known to be unimodal (e.g., the exponential family). Often a histogram
suggests the unimodality of the underlying density function. Parametric assumptions, however,
may not be adequate for many inferential problems. A flexible class of mixtures of Beta
densities that are constrained to be unimodal is presented. It is shown that the estimation
of the mixing weights, and the number of mixing components, can be accomplished using
a weighted least squares criterion subject to a set of linear inequality constraints. The mixing
weights of the Beta mixture are efficiently computed using quadratic programming
techniques. Three criteria for selecting the number of mixing weights are presented and
compared in a small simulation study. More extensive simulation studies are conducted to
demonstrate the performance of the density estimates in terms of popular functional norms
(e.g., $L_p$ norms). The true underlying densities are allowed to be unimodal, symmetric or
skewed, with finite, infinite or semi-finite supports. Code for an R function is provided
which allows the user to input a data set and returns the estimated density, distribution,
quantile, and random sample generating functions.
© 2013 Elsevier B.V. All rights reserved.
1. Introduction
Statistical inference is typically based on an assumed family of unimodal parametric models. Nonparametric density
estimation is a popular alternative when that parametric assumption is not appropriate for modeling the density of the underlying
population. The kernel method, developed by Parzen (1962), is one of the most popular methods of nonparametric
density estimation. It is defined as the weighted average of kernel functions centered at the observed values. This average
is taken with respect to the empirical cumulative distribution function (ECDF), $F_n(\cdot)$, and is dependent on a smoothing or
bandwidth parameter.
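The kernel estimate described above can be illustrated with a minimal sketch. It assumes a Gaussian kernel, which is one common choice and is not specified in the text; the function names `gaussian_kernel`, `kde`, and `count_modes` and the toy sample are hypothetical and used only for illustration. The estimate at a point x is the average of K((x - X_i)/h)/h over the observations, so each observation receives weight 1/n, as it does under the ECDF. The helper `count_modes` shows how the number of local maxima of the estimate depends on the bandwidth h.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumed choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, data, bandwidth):
    """Kernel density estimate at x: (1 / (n h)) * sum_i K((x - X_i) / h).

    Every observation gets weight 1/n, i.e. the average of the kernel
    functions is taken with respect to the empirical CDF of the data.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    n = len(data)
    total = sum(gaussian_kernel((x - xi) / bandwidth) for xi in data)
    return total / (n * bandwidth)

def count_modes(data, bandwidth, lo, hi, num_points=2001):
    """Count local maxima of the estimate on an evenly spaced grid over [lo, hi]."""
    step = (hi - lo) / (num_points - 1)
    values = [kde(lo + i * step, data, bandwidth) for i in range(num_points)]
    return sum(
        1
        for i in range(1, num_points - 1)
        if values[i - 1] < values[i] >= values[i + 1]
    )

# Hypothetical toy sample, for illustration only (not data from the paper).
sample = [-1.9, -1.2, -0.7, -0.3, 0.0, 0.2, 0.6, 1.1, 1.8]
for h in (0.1, 0.5, 1.0):
    print(h, count_modes(sample, bandwidth=h, lo=-4.0, hi=4.0))
```

Running the loop with several bandwidths gives a feel for how a small h tends to produce extra bumps around individual observations, while a larger h smooths them away; this is only a grid-based check, not a formal test of unimodality.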
If one believes the underlying population’s density is unimodal, there are two major advantages to including a unimodality
constraint in the density estimate. First, incorporating extra information about the shape of the density should improve
the overall accuracy of the estimate. Second, extraneous modes, which may hinder the usefulness of the density estimate
as a visual aid and exploratory tool, will be eliminated (Wolters, 2012).
1.1. Unimodal density estimation
Silverman (1981) developed a bandwidth test for unimodality stemming from a nonparametric density estimate.
Unfortunately, this test cannot be used to form the basis for a unimodal density estimate. The density estimate constructed
by the test is smoothed in a global manner that is influenced solely by the features of the density located around the mode
∗ Corresponding author. Tel.: +1 9195152528.
E-mail addresses: bcturnbu@ncsu.edu (B.C. Turnbull), sujit.ghosh@ncsu.edu (S.K. Ghosh).
http://dx.doi.org/10.1016/j.csda.2013.10.021