EXPECTATION PROPAGATION FOR INFERENCE IN NON-LINEAR DYNAMICAL MODELS WITH POISSON OBSERVATIONS

Byron M. Yu¹, Krishna V. Shenoy¹,², Maneesh Sahani³

¹Dept. of Electrical Engineering, ²Neurosciences Program, Stanford University, Stanford, CA, USA
³Gatsby Computational Neuroscience Unit, UCL, London, UK

ABSTRACT

Neural activity unfolding over time can be modeled using non-linear dynamical systems [1]. As neurons communicate via discrete action potentials, their activity can be characterized by the number of events occurring within short, predefined time bins (spike counts). Because the observed data are high-dimensional vectors of non-negative integers, non-linear state estimation from spike counts presents a unique set of challenges. In this paper, we describe why the expectation propagation (EP) framework is particularly well-suited to this problem. We then demonstrate ways to improve the robustness and accuracy of Gaussian quadrature-based EP. Compared to the unscented Kalman smoother, we find that EP-based state estimators provide more accurate state estimates.

1. INTRODUCTION

Consider the following dynamical system for modeling neural spike counts:

$$x_t \mid x_{t-1} \sim \mathcal{N}\left(f(x_{t-1}),\, Q\right) \tag{1a}$$
$$y_t^i \mid x_t \sim \mathrm{Poisson}\left(\lambda_i(x_t) \cdot \Delta\right), \tag{1b}$$

where $x_t \in \mathbb{R}^{p \times 1}$ is the state vector at time $t = 1, \ldots, T$, and $y_t^i \in \{0, 1, 2, \ldots\}$ is the corresponding observed spike count for neuron $i = 1, \ldots, q$, taken in a time bin of width $\Delta$. The functions $f : \mathbb{R}^{p \times 1} \to \mathbb{R}^{p \times 1}$ and $\lambda_i : \mathbb{R}^{p \times 1} \to \mathbb{R}_+$ are, in general, non-linear. The initial state $x_1$ is Gaussian-distributed. For notational compactness, the spike counts for all $q$ simultaneously-recorded neurons are assembled into a $q \times 1$ vector $y_t$, whose $i$th element is $y_t^i$. Note that the observations are discrete-valued and that, typically, $q \gg p$.

Given sequences of observed spike counts from a group of simultaneously-recorded neurons, we would like to estimate both the state $x_t$ at each timepoint and the model parameters in (1).
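As a concrete illustration, the generative model (1a)–(1b) can be simulated directly. The sketch below is a minimal Python implementation; the rotation dynamics, exponential-link rate functions, and fixed initial state are illustrative assumptions for this example, not choices made in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(f, lam, Q, x1, T, delta):
    """Draw a state trajectory and spike counts from (1a)-(1b).

    f     : state transition function R^p -> R^p
    lam   : list of q rate functions R^p -> R_+
    Q     : p x p state noise covariance
    x1    : initial state (Gaussian in the paper; fixed here for brevity)
    delta : time-bin width
    """
    p, q = len(x1), len(lam)
    X = np.zeros((T, p))
    Y = np.zeros((T, q), dtype=int)
    X[0] = x1
    for t in range(T):
        if t > 0:
            # x_t | x_{t-1} ~ N(f(x_{t-1}), Q)
            X[t] = rng.multivariate_normal(f(X[t - 1]), Q)
        # y_t^i | x_t ~ Poisson(lambda_i(x_t) * delta)
        Y[t] = rng.poisson([li(X[t]) * delta for li in lam])
    return X, Y

# Illustrative choices: slow 2-D rotational dynamics, exponential-link rates
theta = 0.1
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
f = lambda x: A @ x
lam = [lambda x, c=c: np.exp(c @ x) for c in rng.standard_normal((5, 2))]
X, Y = simulate(f, lam, Q=0.01 * np.eye(2), x1=np.array([1.0, 0.0]),
                T=50, delta=0.02)
```

With a small bin width $\Delta$, most entries of `Y` come out 0 or 1, matching the sparse spike counts described below.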
This goal can be naturally approached using the Expectation-Maximization (EM) algorithm, as in [1]. Here, we focus on the first of the two objectives, namely state estimation. In particular, we seek smoothed state estimates, conditioned on all past, present, and future observations (denoted $\{y\}_1^T$).

(This work was supported by NIH-NINDS-CRCNS-R01, NSF, NDSEGF, Gatsby, CRF, BWF, ONR, Sloan, and Whitaker.)

The extended Kalman smoother is a common tool for non-linear state estimation; unfortunately, it cannot be directly applied to our problem because the observation noise in (1b) is not additive Gaussian. A possible alternative is the unscented Kalman smoother (UKS) [2, 3], which employs quadrature techniques to approximate multi-dimensional Gaussian integrals that are analytically intractable. For smoothing, the UKS requires that the state dynamics be run backwards in time, either exactly or approximately (using, e.g., a neural network). However, inverting non-linear state dynamics is generally difficult and may not be possible without altering the behavior of the system. Furthermore, the UKS makes Gaussian approximations in the observation space. For discrete-valued observations as in (1b), this approximation may not be appropriate.

Another technique for non-linear state estimation was recently developed [4, 5, 6] using the expectation propagation (EP) framework [7]. In contrast to the UKS, the EP-based approach i) does not require inverting the state dynamics, ii) makes Gaussian approximations only in the state space and not in the observation space, and iii) allows state estimates to be refined iteratively using multiple forward-backward passes. We generally observe tens to hundreds of neurons simultaneously, and the number of spikes emitted by a neuron in a single time bin is most often 0 or 1. Thus, the observations are high-dimensional and distinctly non-Gaussian. In such settings, property ii) above is critical.
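The quadrature scheme underlying the UKS mentioned above is the unscented transform, which approximates a Gaussian integral with a deterministic set of $2p+1$ sigma points. The sketch below shows the standard symmetric sigma-point rule for approximating $\mathbb{E}[g(x)]$ under $x \sim \mathcal{N}(\mu, \Sigma)$; the test function `g`, the parameter values, and the scaling parameter `kappa` are illustrative choices.

```python
import numpy as np

def unscented_points(mean, cov, kappa=0.0):
    """Standard 2p+1 sigma points and weights for the unscented transform."""
    p = len(mean)
    # Columns of the Cholesky factor of (p + kappa) * cov serve as offsets
    L = np.linalg.cholesky((p + kappa) * cov)
    pts = [mean] + [mean + L[:, j] for j in range(p)] \
                 + [mean - L[:, j] for j in range(p)]
    w = np.full(2 * p + 1, 1.0 / (2.0 * (p + kappa)))
    w[0] = kappa / (p + kappa)
    return np.array(pts), w

# Approximate E[g(x)] for x ~ N(mean, cov) with an illustrative g
mean = np.array([0.5, -1.0])
cov = np.array([[0.3, 0.1],
                [0.1, 0.2]])
g = lambda x: np.sin(x[0]) + x[1] ** 2
pts, w = unscented_points(mean, cov, kappa=1.0)
approx = sum(wi * g(xi) for wi, xi in zip(w, pts))
```

The rule is exact for polynomials up to degree 3, which is why the quadratic term here is integrated without error; the EP-based alternative discussed next avoids applying such Gaussian approximations in the observation space altogether.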
The EP framework requires estimating the moments of the joint state posterior distribution $P(x_{t-1}, x_t \mid \{y\}_1^T)$. This can be done using either Gaussian quadrature (GQ-EP) [5] or a modal Gaussian approximation (Laplace-EP) [6]. Whereas Laplace-EP estimates moments based on a local region of the distribution, GQ-EP takes into account more global properties of the distribution. While promising, GQ-EP is known to be sensitive to outlying observations [5] and can only be used with quadrature rules that satisfy certain properties.

In the following, we first summarize the EP framework for non-linear state estimation. We then show how the sensitivity to outliers in GQ-EP can be overcome. Next, we demonstrate how quadrature rules that are more accurate than existing ones for GQ-EP can be derived. Finally, we compare the state estimation accuracy of the UKS, GQ-EP, and Laplace-EP techniques on the model neural dynamical system (1).
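To make the quadrature-based moment computation concrete, the following one-dimensional sketch estimates the moments of a "tilted" distribution $p(x) \propto \mathcal{N}(x; m, v)\,\mathrm{Poisson}(y; e^x \Delta)$ using Gauss-Hermite quadrature. This is only an illustration of the kind of Gaussian-weighted moment integral GQ-EP must evaluate: the exponential rate link and scalar state are assumptions for this example, and the paper's actual moments are taken over the joint posterior $P(x_{t-1}, x_t \mid \{y\}_1^T)$.

```python
import math
import numpy as np

def gh_moments(m, v, y, delta, n=20):
    """Zeroth, first, and second central moments of
    p(x) proportional to N(x; m, v) * Poisson(y; exp(x) * delta),
    via n-point Gauss-Hermite quadrature."""
    u, gh_w = np.polynomial.hermite.hermgauss(n)  # nodes/weights for int e^{-u^2} h(u) du
    x = m + np.sqrt(2.0 * v) * u                  # change of variables to N(m, v)
    rate = np.exp(x) * delta
    lik = rate ** y * np.exp(-rate) / math.factorial(y)
    w = (gh_w / np.sqrt(np.pi)) * lik             # Gaussian weight times likelihood
    Z = w.sum()                                   # normalizer (zeroth moment)
    mean = (w * x).sum() / Z
    var = (w * x ** 2).sum() / Z - mean ** 2
    return Z, mean, var

# An informative count pulls the mean toward log(y / delta) and shrinks the variance
Z, mean, var = gh_moments(m=0.0, v=1.0, y=5, delta=1.0)
```

Note that the quadrature nodes are placed according to the Gaussian factor alone; when an outlying observation concentrates the likelihood far from those nodes, the moment estimates degrade, which is the sensitivity discussed above.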