Notes on spike sorting

Joshua Vogelstein and Liam Paninski
Department of Statistics and Center for Theoretical Neuroscience
Columbia University
liam@stat.columbia.edu
http://www.stat.columbia.edu/liam

July 22, 2008

Abstract

Main goal: deal with the spike-collision problem in a computationally tractable way.

The model

Assume single-electrode data, to keep the notation manageable. The model:

$$V(t) = \sum_{i=1}^{I} V_i(t) + e(t),$$

where $e(t)$ is temporally i.i.d. noise (i.e., we assume that the signal has been prewhitened), with a log-concave density; $I$ denotes the number of units visible on the electrode. The signal

$$V_i(t) = \sum_j V_{ij}(t - t_{ij})$$

is the train of spikes contributed by the $i$-th cell; here $t_{ij}$ denotes the $j$-th spike time from the $i$-th cell, and $V_{ij}$ is the $j$-th spike waveform from the $i$-th cell. Now our model for the waveforms is

$$V_{ij}(t) = \sum_{k=1}^{K} a_{ijk} \exp(-t/\tau_k) \, 1(t > 0).$$

Note that $t_{ij}$ here is located at the beginning of the spike, i.e., well before the first time $V_{ij}(t)$ crosses threshold.

The coefficients $a_{ijk}$ are drawn from a $K$-dimensional distribution. For simplicity, assume $a_{ijk}$ and $a_{i'j'k'}$ are independent if $(i, j) \neq (i', j')$. Note that we do allow dependencies in the $k$ slot; in fact, since the waveform $V_{ij}(t)$ is smooth in time, the $a_{ijk}$ should be chosen to satisfy at least the linear constraints $V_{ij}(0) = 0$ and $dV_{ij}/dt|_{t=0} = 0$, i.e., $\sum_k a_{ijk} = 0$ and $\sum_k a_{ijk}/\tau_k = 0$, which means that $p(a_{ijk})$ is supported on a subspace of dimension less than $K$. That said, again, for now we have no temporal dependencies in the spike waveforms (although this generalization can be built in later).

We choose this somewhat nonstandard sum-of-exponentials waveform model to exploit HMM methods (since $V_{ij}(t)$ evolves in a Markovian manner), as described below.
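
To make the Markov property of the sum-of-exponentials waveform concrete, here is a minimal sketch, not part of the original notes. It assumes a uniform time grid of width dt (the notes do not specify a discretization), hypothetical time constants tau, and a hypothetical Gaussian draw for the coefficients, projected so that the two smoothness constraints hold (coefficients summing to zero, and coefficients divided by their time constants summing to zero). Each component a_k exp(-t/tau_k) is multiplied by the fixed factor exp(-dt/tau_k) at every step, so the K-vector of components is a linear Markov state and the waveform is its sum. The function names are illustrative only.

```python
import numpy as np

def waveform_closed_form(a, tau, t):
    """V(t) = sum_k a_k exp(-t/tau_k) 1(t > 0), evaluated at the times in t."""
    t = np.asarray(t, dtype=float)
    comps = a[:, None] * np.exp(-t[None, :] / tau[:, None])
    return np.where(t > 0, comps.sum(axis=0), 0.0)

def waveform_markov(a, tau, dt, n_steps):
    """Same waveform on the grid t = dt, 2*dt, ..., via a one-step recursion."""
    decay = np.exp(-dt / tau)       # per-step decay factor of each component
    x = a.astype(float).copy()      # state just after spike onset: x_k = a_k
    out = np.empty(n_steps)
    for n in range(n_steps):
        x = decay * x               # x(t + dt) depends only on x(t)
        out[n] = x.sum()            # the waveform is the sum of the state
    return out

def constrained_coefficients(tau, rng):
    """Draw a with sum_k a_k = 0 and sum_k a_k / tau_k = 0.

    Assumption: a standard Gaussian draw projected onto the constraint
    subspace; the notes do not specify the distribution of the coefficients.
    """
    C = np.vstack([np.ones_like(tau), 1.0 / tau])   # 2 x K constraint matrix
    z = rng.standard_normal(tau.size)
    # Orthogonal projection of z onto the null space of C.
    return z - C.T @ np.linalg.solve(C @ C.T, C @ z)

tau = np.array([0.2, 0.5, 1.0, 2.0])    # hypothetical time constants
rng = np.random.default_rng(0)
a = constrained_coefficients(tau, rng)

dt, n_steps = 0.01, 500
t = dt * np.arange(1, n_steps + 1)
v_closed = waveform_closed_form(a, tau, t)
v_markov = waveform_markov(a, tau, dt, n_steps)
max_abs_diff = np.max(np.abs(v_closed - v_markov))
```

The recursion and the closed form agree up to floating-point error, which is the property that lets the waveform be tracked with a K-dimensional state. With the two constraints imposed, the coefficient vector lives in a subspace of dimension K - 2 when the time constants are distinct.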