A NOVEL REJECTION SAMPLING SCHEME FOR POSTERIOR PROBABILITY
DISTRIBUTIONS
Luca Martino and Joaqu´ ın M´ ıguez
Department of Signal Theory and Communications, Universidad Carlos III de Madrid.
Avda. de la Universidad 30, 28911 Legan´ es, Spain. E-mail: luca@tsc.uc3m.es
ABSTRACT
Rejection sampling (RS) is a well-known method to draw from ar-
bitrary target probability distributions, which has important applica-
tions by itself or as a building block for more sophisticated Monte
Carlo techniques. The main limitation to the use of RS is the need
to find an adequate upper bound for the ratio of the target probability
density function (pdf) over the proposal pdf from which the samples
are generated. There are no general methods to analytically find this
bound, except in the particular case in which the target pdf is log-
concave. In this paper we adopt a Bayesian view of the problem and
propose a general RS scheme to draw from the posterior pdf of a
signal of interest using its prior density as a proposal function. The
method enables the analytical calculation of the bound and can be ap-
plied to a large class of target densities. We illustrate its use with a
simple numerical example.
Index Terms— Rejection sampling; Monte Carlo methods;
Monte Carlo integration; Overbounding; Sampling methods.
1. INTRODUCTION
Monte Carlo statistical methods are powerful tools for numerical in-
ference and optimization [5]. Currently, there exist several classes
of MC techniques, including the popular Markov Chain Monte Carlo
(MCMC) [2] and particle filtering [1] algorithms, which enjoy nume-
rous applications in signal processing.
Among these, rejection sampling (RS) [5, Chapter 2] is a classical
Monte Carlo method for “universal sampling”. It can be used to ge-
nerate samples from a target probability density function (pdf) by dra-
wing from a possibly simpler proposal density. The sample is either
accepted or rejected by an adequate test of the ratio of the two pdf’s,
and it can be proved that accepted samples are actually distributed ac-
cording to the target probability distribution. RS can be applied as a
tool by itself, in problems where the goal is to approximate integrals
with respect to (w.r.t.) the pdf of interest, but more often it is a useful
building block for more sophisticated Monte Carlo procedures [3, 4].
An important limitation of RS methods is the need to analytically
establish a bound for the ratio of the target and proposal densities.
There is a lack of general methods for bound computation, howe-
ver. One exception is the adaptive rejection sampling (ARS) method
[3, 5] which, given a target density, provides a procedure to obtain a
suitable proposal pdf (for which the bound is easy to compute). This
procedure is only valid when the target pdf is strictly log-concave,
which is not the case in most practical cases. In this paper we adopt
a Bayesian view of the problem and propose a general procedure to
apply RS when the target pdf is the posterior pdf of a signal of inter-
est (SI) given a collection of observations and the proposal density is
the prior of the SI. Unlike the ARS technique, our method can handle
target pdf’s with several modes (hence not log-concave). The number
of available observations can be arbitrary and each observation can be
related to the SI through a different nonlinearity. We assume that the
observations are contaminated with additive noise, but these random
variables need not be identically distributed.
The remaining of the paper is organized as follows. We formally
describe the signal model in Section 2. Some useful definitions and
basic assumptions are introduced in Section 3. In Section 4 we derive
the proposed method to find an upper bound of the target pdf in its
general form, while Section 5 is devoted to specific cases of practi-
cal interest. The proposed technique is exemplified in Section 6 and
Section 7 is devoted to the conclusions.
2. MODEL AND PROBLEM STATEMENT
Many signal processing problems involve the estimation of an unob-
served SI, x ∈ R
m
(vectors are denoted as lower-case bold-face
letters all through the paper), from a sequence of related observa-
tions. We assume an arbitrary prior probability density function
1
(pdf) for the SI, p(x), and consider n scalar observations, yi ∈ R,
i =1,...,n, which are obtained through nonlinear transformations
of the signal x contaminated with additive noise. Formally, we write
y1 = g1(x)+ ξ1,...,yn = gn(x)+ ξn where y =[y1,...,yn]
T
∈
R
n
is the vector of available observations, gi : R
m
→ R, i =
1,...,n, are nonlinearities and ξi are independent noise variables,
possibly with different distributions for each i. We will assume noise
pdf’s of the form
p(ξi )= ki exp
-αi
¯
Vi (ξi )
, ki ,αi > 0, (1)
where ki ,αi are real constants and
¯
V (ξi ) is a function, subsequently
referred to as marginal potential, with the following properties: (I) it
is real and non negative,
¯
Vi : R → [0, +∞); and (II) it is increasing
(
d
¯
V
i
dξ
i
> 0) for ξi > 0 and decreasing (
d
¯
V
i
dξ
i
< 0) for ξi < 0.
These conditions imply that
¯
Vi (ξi ) has a unique minimum at ξ
∗
i
=
0 and, as a consequence, p(ξi ) has only one maximum (mode) at
ξ
∗
i
=0. Since the noise variables are independent, the joint pdf
p(ξ1,ξ2,...,ξn)=
n
i=1
p(ξn) is easy to construct and we can de-
fine a joint potential V
(n)
: R
n
→ [0, +∞) as
V
(n)
(ξ1,...,ξn) - log [p(ξ1,...,ξn)] = -
n
i=1
log p(ξn). (2)
Substituting (1) into (2) yields
V
(n)
(ξ1,...,ξn)= cn +
n
i=1
αi
¯
Vi (ξi ) (3)
1
We use p(·) to denote the probability density function (pdf) of a random
magnitude, i.e., p(x) denotes the pdf of x and p(y) is the pdf of y, possibly
different. The conditional pdf of x given y is written as p(x|y).
2921 978-1-4244-2354-5/09/$25.00 ©2009 IEEE ICASSP 2009