Journal of Neuroscience Methods 225 (2014) 13–28
Journal homepage: www.elsevier.com/locate/jneumeth

Computational Neuroscience

Using Tweedie distributions for fitting spike count data

Dina Moshitch a, Israel Nelken a,b,∗

a Department of Neurobiology, Silberman Institute of Life Sciences, Hebrew University, Jerusalem, Israel
b The Edmond and Lily Safra Center for Brain Sciences, Hebrew University, Jerusalem, Israel

Highlights

• The variance of spike count distributions often depends supralinearly on the mean.
• We used Tweedie distributions, which have this property, to fit spike count data.
• We show how to estimate the parameters of the Tweedie distributions from the data.
• Tweedie distributions often fit the data better than Poisson distributions.
• Tweedie distributions increase the reliability of tests for stimulus dependence.

Article info

Article history: Received 21 May 2013; received in revised form 6 January 2014; accepted 7 January 2014

Keywords: Tweedie distributions; Spike count distribution; Generalized linear models (GLM); Auditory cortex; Transposed stimuli; Electrophysiology; Extracellular recordings

Abstract

Background: The nature of spike count distributions is of great practical concern for the analysis of neural data. These distributions often have a tendency for ‘failures’ and a long tail of large counts, and may show a strong dependence of the variance on the mean. Furthermore, spike count distributions often show multiplicative rather than additive effects of covariates. We analyzed the responses of neurons in primary auditory cortex to transposed stimuli as a function of interaural time differences (ITD). In more than half of the cases, the variance of neuronal responses showed a supralinear dependence on the mean spike count.

New method: We explored the use of the Tweedie family of distributions, which has a supralinear dependence of the variance on the mean.
To quantify the effects of ITD on neuronal responses, we used generalized linear models (GLMs), and developed methods for significance testing under the Tweedie assumption.

Results: We found the Tweedie distribution to be generally a better fit to the data than the Poisson distribution for over-dispersed responses.

Comparison with existing methods: Standard analysis of variance wrongly assumes Gaussian distributions with fixed variance and additive effects, but even generalized models under Poisson assumptions may be hampered by the over-dispersion of spike counts. The use of GLMs assuming Tweedie distributions increased the reliability of tests of sensitivity to ITD in our data.

Conclusions: When spike count variance depends strongly on the mean, the use of Tweedie distributions for analyzing the data is advised.

© 2014 Elsevier B.V. All rights reserved.

∗ Corresponding author at: Department of Neurobiology, The Alexander Silberman Institute of Life Sciences, Edmond J. Safra Campus, Hebrew University, Jerusalem 91904, Israel. Tel.: +972 2 6584229; fax: +972 2 6586077.
E-mail addresses: Dina.farkas@mail.huji.ac.il (D. Moshitch), israel@cc.huji.ac.il (I. Nelken).

1. Introduction

The nature of spike count distributions in response to repeated presentations of the same stimulus is of great practical concern for the analysis of neural data. The variability of spike counts, which is usually attributed to the inevitable uncontrolled variables that occur in all neurophysiological experiments, reduces the information on stimulus identity carried by the neuronal responses. Spike count data are often assumed to have a Poisson distribution. This assumption is most often tested, if at all, by calculating the ‘Fano factor’ (Buracas et al., 1998), defined as the ratio of the variance to the mean spike count over trials. A perfectly repeatable neural response has a Fano factor equal to zero, and a Poisson distribution has a Fano factor equal to one.
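As a minimal illustration of the Fano factor defined above, the following Python sketch computes the variance-to-mean ratio of spike counts over trials. The function name `fano_factor`, the use of the unbiased sample variance, and the simulated spike counts are assumptions made for illustration only; they are not taken from the paper. In particular, the over-dispersed example is a gamma-mixed Poisson, chosen only because it is easy to simulate, and it is not the Tweedie model discussed here.

```python
import numpy as np


def fano_factor(counts):
    """Return the ratio of the variance to the mean of spike counts over trials.

    Uses the unbiased sample variance (ddof=1), which is an assumption of
    this sketch. Returns NaN when the mean count is zero, because the ratio
    is undefined in that case.
    """
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean()
    if mean == 0:
        return float("nan")
    return counts.var(ddof=1) / mean


# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)

# A perfectly repeatable response: the same count on every trial.
repeatable = np.full(200, 5)

# Poisson-distributed counts with a mean of 5 spikes per trial.
poisson = rng.poisson(lam=5.0, size=20000)

# Over-dispersed counts: a Poisson whose rate varies from trial to trial
# (gamma-distributed rate with mean 2.0 * 2.5 = 5 spikes per trial).
rates = rng.gamma(shape=2.0, scale=2.5, size=20000)
overdispersed = rng.poisson(lam=rates)

ff_repeatable = fano_factor(repeatable)
ff_poisson = fano_factor(poisson)
ff_overdispersed = fano_factor(overdispersed)
```

With these simulated data, the repeatable response gives a Fano factor of zero, the Poisson counts give a value close to one, and the rate-varying counts give a value well above one, which is the over-dispersed regime that motivates the rest of the paper.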
In fact, a large number of studies have reported Fano factors that are greater than one (e.g. Heggelund and Albus, 1978 among others), although several exceptions to this rule showing low variability of the response have also been reported (e.g. DeWeese et al., 2003).

0165-0270/$ – see front matter
http://dx.doi.org/10.1016/j.jneumeth.2014.01.004

Standard statistical tests that are often used for spike counts may be severely hampered by such effects. The standard analysis of variance (ANOVA) tests require spike count distributions to be