A NOVEL REJECTION SAMPLING SCHEME FOR POSTERIOR PROBABILITY DISTRIBUTIONS Luca Martino and Joaqu´ ın M´ ıguez Department of Signal Theory and Communications, Universidad Carlos III de Madrid. Avda. de la Universidad 30, 28911 Legan´ es, Spain. E-mail: luca@tsc.uc3m.es ABSTRACT Rejection sampling (RS) is a well-known method to draw from ar- bitrary target probability distributions, which has important applica- tions by itself or as a building block for more sophisticated Monte Carlo techniques. The main limitation to the use of RS is the need to find an adequate upper bound for the ratio of the target probability density function (pdf) over the proposal pdf from which the samples are generated. There are no general methods to analytically find this bound, except in the particular case in which the target pdf is log- concave. In this paper we adopt a Bayesian view of the problem and propose a general RS scheme to draw from the posterior pdf of a signal of interest using its prior density as a proposal function. The method enables the analytical calculation of the bound and can be ap- plied to a large class of target densities. We illustrate its use with a simple numerical example. Index TermsRejection sampling; Monte Carlo methods; Monte Carlo integration; Overbounding; Sampling methods. 1. INTRODUCTION Monte Carlo statistical methods are powerful tools for numerical in- ference and optimization [5]. Currently, there exist several classes of MC techniques, including the popular Markov Chain Monte Carlo (MCMC) [2] and particle filtering [1] algorithms, which enjoy nume- rous applications in signal processing. Among these, rejection sampling (RS) [5, Chapter 2] is a classical Monte Carlo method for “universal sampling”. It can be used to ge- nerate samples from a target probability density function (pdf) by dra- wing from a possibly simpler proposal density. The sample is either accepted or rejected by an adequate test of the ratio of the two pdf’s, and it can be proved that accepted samples are actually distributed ac- cording to the target probability distribution. RS can be applied as a tool by itself, in problems where the goal is to approximate integrals with respect to (w.r.t.) the pdf of interest, but more often it is a useful building block for more sophisticated Monte Carlo procedures [3, 4]. An important limitation of RS methods is the need to analytically establish a bound for the ratio of the target and proposal densities. There is a lack of general methods for bound computation, howe- ver. One exception is the adaptive rejection sampling (ARS) method [3, 5] which, given a target density, provides a procedure to obtain a suitable proposal pdf (for which the bound is easy to compute). This procedure is only valid when the target pdf is strictly log-concave, which is not the case in most practical cases. In this paper we adopt a Bayesian view of the problem and propose a general procedure to apply RS when the target pdf is the posterior pdf of a signal of inter- est (SI) given a collection of observations and the proposal density is the prior of the SI. Unlike the ARS technique, our method can handle target pdf’s with several modes (hence not log-concave). The number of available observations can be arbitrary and each observation can be related to the SI through a different nonlinearity. We assume that the observations are contaminated with additive noise, but these random variables need not be identically distributed. The remaining of the paper is organized as follows. We formally describe the signal model in Section 2. Some useful definitions and basic assumptions are introduced in Section 3. In Section 4 we derive the proposed method to find an upper bound of the target pdf in its general form, while Section 5 is devoted to specific cases of practi- cal interest. The proposed technique is exemplified in Section 6 and Section 7 is devoted to the conclusions. 2. MODEL AND PROBLEM STATEMENT Many signal processing problems involve the estimation of an unob- served SI, x R m (vectors are denoted as lower-case bold-face letters all through the paper), from a sequence of related observa- tions. We assume an arbitrary prior probability density function 1 (pdf) for the SI, p(x), and consider n scalar observations, yi R, i =1,...,n, which are obtained through nonlinear transformations of the signal x contaminated with additive noise. Formally, we write y1 = g1(x)+ ξ1,...,yn = gn(x)+ ξn where y =[y1,...,yn] T R n is the vector of available observations, gi : R m R, i = 1,...,n, are nonlinearities and ξi are independent noise variables, possibly with different distributions for each i. We will assume noise pdf’s of the form p(ξi )= ki exp -αi ¯ Vi (ξi ) , ki i > 0, (1) where ki i are real constants and ¯ V (ξi ) is a function, subsequently referred to as marginal potential, with the following properties: (I) it is real and non negative, ¯ Vi : R [0, +); and (II) it is increasing ( d ¯ V i i > 0) for ξi > 0 and decreasing ( d ¯ V i i < 0) for ξi < 0. These conditions imply that ¯ Vi (ξi ) has a unique minimum at ξ i = 0 and, as a consequence, p(ξi ) has only one maximum (mode) at ξ i =0. Since the noise variables are independent, the joint pdf p(ξ12,...,ξn)= n i=1 p(ξn) is easy to construct and we can de- fine a joint potential V (n) : R n [0, +) as V (n) (ξ1,...,ξn) - log [p(ξ1,...,ξn)] = - n i=1 log p(ξn). (2) Substituting (1) into (2) yields V (n) (ξ1,...,ξn)= cn + n i=1 αi ¯ Vi (ξi ) (3) 1 We use p(·) to denote the probability density function (pdf) of a random magnitude, i.e., p(x) denotes the pdf of x and p(y) is the pdf of y, possibly different. The conditional pdf of x given y is written as p(x|y). 2921 978-1-4244-2354-5/09/$25.00 ©2009 IEEE ICASSP 2009