A Simple Stochastic Gradient Variational Bayes for the Correlated Topic Model

Tomonari Masada 1 and Atsuhiro Takasu 2

1 Nagasaki University, 1-14 Bunkyo-machi, Nagasaki, Japan
masada@nagasaki-u.ac.jp
2 National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo, Japan
takasu@nii.ac.jp

Abstract. This paper proposes a new inference method for the correlated topic model (CTM) [6, 5], an extension of LDA [7] that models correlations among latent topics. The proposed inference can be regarded as an instance of stochastic gradient variational Bayes (SGVB) [14, 20]. By constructing the inference network with the diagonal logistic normal distribution, a simple inference is achieved; in particular, there is no need to invert the covariance matrix explicitly. The variational Bayes inference given in the original paper [6] was evaluated in terms of predictive perplexity by comparing it with that of LDA. We also perform a comparison with LDA in terms of predictive perplexity, but consider the following two inference methods for LDA: collapsed Gibbs sampling (CGS) [10] and collapsed variational Bayes with a zero-order Taylor expansion approximation (CVB0) [2], where the latter was not considered in the original paper. While CVB0 for LDA gave the best result in all settings of our experiment, the proposed inference achieved perplexities comparable to those of CGS for LDA.

1 Introduction

Topic modeling is one of the outstanding text mining techniques based on unsupervised machine learning, and it has a wide variety of applications. Recently, even researchers in the social sciences have become interested in topic modeling [11, 12]. Since the proposal of LDA [7], many extensions have been provided by considering more realistic modeling of latent topics and/or of word co-occurrence patterns.
In particular, LDA cannot model rich patterns of correlation among latent topics, because the Dirichlet distribution is used as the prior. The correlated topic model (CTM) was therefore proposed [6, 5]. One difficulty with CTM is that the posterior inference is somewhat complicated, because the logistic normal distribution [1], used as the prior of the per-document topic probability distributions, is not conjugate to the multinomial distribution. Consequently, the variational Bayesian inference proposed in the original paper of CTM [6] provides no closed-form update for a part of the variational posterior parameters and thus adopts a gradient-based optimization for those parameters. Further, the proposed inference uses the
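To make the role of the logistic normal prior concrete, the following is a minimal sketch of drawing per-document topic proportions from a diagonal logistic normal via the reparameterization trick that SGVB relies on. The parameter names `mu` and `sigma` are illustrative, not taken from the paper; note that with a diagonal covariance the draw is elementwise, so no covariance matrix inversion is involved.

```python
import numpy as np

def sample_logistic_normal(mu, sigma, rng):
    """Draw topic proportions theta from a diagonal logistic normal.

    Reparameterization: theta = softmax(mu + sigma * eps), eps ~ N(0, I).
    mu and sigma are K-dimensional mean and standard-deviation vectors
    (hypothetical variational parameters for one document).
    """
    eps = rng.standard_normal(mu.shape)
    eta = mu + sigma * eps           # Gaussian draw with diagonal covariance
    e = np.exp(eta - eta.max())      # numerically stable softmax
    return e / e.sum()

K = 4  # illustrative number of topics
rng = np.random.default_rng(0)
theta = sample_logistic_normal(np.zeros(K), np.ones(K), rng)
# theta is a point on the (K-1)-simplex: nonnegative and sums to 1
```

Because `theta` is a deterministic, differentiable function of `mu`, `sigma`, and the noise `eps`, gradients of a Monte Carlo objective can flow back to the variational parameters, which is the mechanism SGVB-style inference exploits.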