Distilling Importance Sampling

Dennis Prangle* (School of Mathematics, University of Bristol, UK)
Cecilia Viscardi (Department of Statistics, Computer Science, Applications, University of Florence, Italy)

Abstract

Many complicated Bayesian posteriors are difficult to approximate by either sampling or optimisation methods. Therefore we propose a novel approach combining features of both, making use of a flexible parameterised family of densities, typically a normalising flow. We start with a density from this family approximating a highly tempered posterior. This is used as a proposal density in importance sampling to produce a weighted sample from a less tempered posterior. This sample is then used in optimisation to update the parameters of the density, which we view as "distilling" the importance sampling results. We iterate these steps, gradually reducing the tempering, eventually reaching a good approximation to the posterior. We illustrate our method in three challenging examples, on queuing, epidemiology, and inference for stochastic differential equations. These cover applications in both likelihood-based and likelihood-free inference.

1 Introduction

Bayesian inference has had great success in recent decades (Green et al., 2015), but remains challenging in models with a complex posterior dependence structure, e.g. those involving latent variables. Monte Carlo methods

* dennis.prangle@bristol.ac.uk

arXiv:1910.03632v4 [stat.CO] 1 Apr 2022