Unbiased Estimation of the Gradient of the Log-Likelihood for a Class of Continuous-Time State-Space Models

BY MARCO BALLESIO & AJAY JASRA

Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, 23955, KSA.
E-mail: marco.ballesio@kaust.edu.sa, ajay.jasra@kaust.edu.sa

Abstract

In this paper, we consider static parameter estimation for a class of continuous-time state-space models. Our goal is to obtain an unbiased estimate of the gradient of the log-likelihood (score function), which is an estimate that is unbiased even if the stochastic processes involved in the model must be discretized in time. To achieve this goal, we apply a doubly randomized scheme (see, e.g., [13, 14]), that involves a novel coupled conditional particle filter (CCPF) on the second level of randomization [15]. Our novel estimate helps facilitate the application of gradient-based estimation algorithms, such as stochastic-gradient Langevin descent. We illustrate our methodology in the context of stochastic gradient descent (SGD) in several numerical examples and compare with the Rhee & Glynn estimator [22, 23].

Keywords: Score Function, Particle Filter, Coupled Conditional Particle Filter.

1 Introduction

State-space models are used in many applications in applied mathematics, statistics, and economics (see, e.g., [10]). They typically comprise a hidden or unobserved Markov chain that is associated with an observation process. In many cases of practical interest, there are unknown finite-dimensional parameters, $\theta \in \Theta \subseteq \mathbb{R}^{d_\theta}$, that characterize the dynamics of the hidden and observed processes. The objective of this paper is to consider the estimation of these parameters on the basis of a fixed-length dataset, when the observations and hidden process are both diffusion processes.

There are many challenges in parameter estimation for the class of continuous-time state-space models under consideration.
The first challenge is that, in practice, data are not observed in continuous time; thus, it is necessary to time-discretize the observation process at the very least (e.g., with the Euler-Maruyama method). The second challenge is that the hidden diffusion process may often be unavailable (e.g., for exact simulation) without also using time discretization. The third challenge is that, even under the aforementioned approximations, neither the log-likelihood function nor its gradient with respect to θ (the score function), on which the estimation paradigm followed in this paper is based, can be computed analytically.

We proceed under the assumption that one must time-discretize both the observation and hidden process and that one seeks the parameters that maximize the log-likelihood function (the result of which is the maximum likelihood estimator (MLE)). We use a particular identity for the score function that is provided in [9] and based on the Girsanov change of measure. Alternative identities are discussed in [3] but are not considered in this paper.

Given the problem under study, there exist several mechanisms for computing the MLE; however, we restrict ourselves to gradient-based algorithms, that is, iterative algorithms that compute estimates of θ using the score function. The objective is then to estimate the score function for any given θ. We remark that, to ensure convergence of the gradient algorithm, it is often preferable to produce an unbiased stochastic estimate of the score: it is well known that ensuring the convergence of stochastic gradient methods is simpler when the estimate of the gradient is unbiased (see, e.g., [2]).

In the context of state-space models in discrete and continuous time, there already exists substantial literature on score estimation (see, e.g., [3, 7, 8, 21]).
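To make the time-discretization issue concrete, the following is a minimal Python sketch of the Euler-Maruyama method mentioned above. It is an illustration only and not the model or algorithm of this paper: the Ornstein-Uhlenbeck diffusion dX_t = -θ X_t dt + σ dW_t, the parameter values, the step size 2^(-level), and the names `euler_maruyama` and `euler_mean` are all assumptions chosen because this diffusion has a known exact mean, so the discretization bias can be computed directly.

```python
import math
import random


def euler_maruyama(drift, diffusion, x0, T, level, rng):
    """Simulate one path of dX_t = drift(X_t) dt + diffusion(X_t) dW_t on [0, T]
    with step size h = 2**(-level). Returns the list of states on the time grid."""
    h = 2.0 ** (-level)
    n_steps = int(round(T / h))
    path = [x0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(h))  # Brownian increment over one step
        x = path[-1]
        path.append(x + drift(x) * h + diffusion(x) * dw)
    return path


def euler_mean(theta, x0, T, level):
    """Mean of the Euler-Maruyama scheme at time T for the (hypothetical)
    Ornstein-Uhlenbeck example: each step multiplies the mean by (1 - theta*h)."""
    h = 2.0 ** (-level)
    return x0 * (1.0 - theta * h) ** int(round(T / h))


# Hypothetical parameter values, for illustration only.
theta, sigma, x0, T = 1.0, 0.5, 1.0, 1.0

rng = random.Random(0)
path = euler_maruyama(lambda x: -theta * x, lambda x: sigma, x0, T, 5, rng)
print("grid points at level 5:", len(path))

# The exact diffusion has mean x0 * exp(-theta * T); the discretized process does not.
exact = x0 * math.exp(-theta * T)
for level in (3, 5, 8):
    print(level, euler_mean(theta, x0, T, level) - exact)
```

In this toy example the bias in the mean is nonzero at every fixed level and shrinks only as the step size decreases, which is the kind of discretization bias that an estimate of the score inherits when it is built on a fixed time grid.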
Most of these techniques are based on sequential Monte Carlo (SMC) algorithms (see [11] for an introduction), which are simulation-based methods that use a collection of $N \geq 1$ samples generated in parallel and sequentially in time. For the problem of interest, when these algorithms can be applied, they produce consistent estimates of the score function (in terms of the number of samples N), but they will typically introduce a bias with respect to the time discretization. The aim of this paper is to address this problem.

Intrinsically, the problem of unbiased estimation of the score function can be placed within the context of exact estimation of the (ratios of) expectations with respect to diffusion processes. The topic of unbiased estimation of expectations associated with diffusion processes has received considerable attention in recent years. The approaches can be roughly divided into two distinct categories: one that focuses on exact simulation of the diffusion of interest [4, 5] (see also [6]), and another that is based on randomization schemes [20, 22]. The first class of methodologies is based on an elegant paradigm constructing unbiased estimators using the underlying properties of the diffusion process. Due to its nature, however, this class of methodologies cannot be applied for every diffusion

arXiv:2105.11522v2 [stat.ML] 28 May 2021