Single-Trial Estimation and Timewarped Averaging of EEG-data Matthias Ihrke 1,3 , Hecke Schrobsdorﬀ 1,2 , J. Michael Herrmann 1,2 [1] Bernstein Center for Computational Neuroscience, Bunsenstrasse 10, 37073G¨ottingen, [2] University of G¨ ottingen, Inst. for Nonlinear Dynamics, Bunsenstrasse 10, 37073 G¨ottingen, [3] Georg-Elias-M¨ uller-Institute, University of G¨ ottingen, Waldweg 26 37073 G¨ ottingen. ihrke|hecke|michael@nld.ds.mpg.de göttingen Abstract Averaging of Event-Related Potentials (ERP) is one of the main techniques used for the analysis of electroencephalographic data acquired in neuropsycho- logical experiments. Even though it is a common observation that reaction times in experiments that require higher cognitive functions show a large vari- ance, the problem of incorporating this knowledge into the computation of the average has been largely neglected [2]. Tackling these diﬃculties, we pro- pose a new averaging method based on the estimation of single-trial ERPs combined with a dynamic time warping algorithm. Simulation results showing the performance of the new method compared to alternative approaches are discussed. Introduction Event-Related Potential (ERP) = segment of EEG-data that directly fol- lows an internal or external event (e.g. stimulation) -500 0 500 1000 1500 2000 -8 -6 -4 -2 0 2 4 6 8 -500 0 500 1000 1500 2000 -150 -100 -50 0 50 100 150 200 -500 0 500 1000 1500 2000 -80 -60 -40 -20 0 20 40 60 80 100 120 -500 0 500 1000 1500 2000 -100 -80 -60 -40 -20 0 20 40 60 80 100 . . . μV ms ms ms μV amplitude of signal ≈ 10μV amplitude of noise ≈ 100μV RMSE ≈ 30 measured signal (single trial) Fp1 〈R i 〉 i ± σ R region in which the reaction times probably lie in the single trials average over 132 trials ms μV μV Situation: 1 very noisy data (SNR ≈−20 dB) 2 large variance in reaction times (≈ 200 ms) ⇒ unreliable and statistically very uneﬃcient average (low SNR of single trials) Goal 1 – ﬁnd a reliable estimate of the ERP in a single trial Goal 2 – ﬁnd a way to correct the temporal distortion of ERP in the single trials to improve the average (using reaction time information) Noise Models A noise model typically used in the analysis of EEG-data [4] is s i (t)= u(t)+ ǫ(t) ∀i (1) with ǫ zero-mean and stationary. This is an oversimpliﬁaction (nonlinear in- teractions between the signal and the “noise” can occur). Therefore we postulate the existance of trial-dependant signals u i that are stretched/shrinked and scaled versions of an invariant signal u. The measured signal in this setting is composed as s i (t)= α i u(φ −1 i (t)) + ǫ(t), (2) where φ is a monotonous function that minimizes ‖ u(t) − u i (φ i (t)) ‖ ∗ (3) and ǫ zero-mean and stationary. The challenge is the identiﬁcation of the unknown signal u and the computation of the transformation functions φ i , to correct the average ˆ u(t)= 〈u i 〉 i = 1 N N  i=0 s i (φ i (t)). (4) Single-Trial ERP Estimation Wavelet-based Denoising achieves particularly good results in terms of signal-to-noise ratios even for correlated noise →adequate choice for ERP estimation General Method: If W is the wavelet-transform, 1 apply W to the signal to obtain the wavelet-coeﬃcients w jk , 2 apply a thresholding function η (w, λ) with threshold λ to the coeﬃ- cients to obtain the thresholded coeﬃcients ˜ w jk either level-dependant or universally, 3 apply the inverse wavelet transform on the thresholded coeﬃcients to obtain a “cleaner” signal ˆ u = W −1 ˜ w jk . Parameters: 1 the choice of the threshold λ is crucial ﬁxed [5]: λ j = σ j  2 log e (N log 2 N ) adaptive [1]: λ j = arg min 0≤t≤ √ 2 log d SURE (t, w jk ) SURE (t, w jk )= N − 2 · #{k : |w jk |≤ t} + ∑ N k =1 min(|w jk |,t) 2 2 universal or level-dependent thresholding (per resolution-level) 3 the choice of the thresholding function η (hard or soft) η s (w, λ)= sgn(w )(|w |− λ) + or η h (w, λ)= wI {|w |≥ λ} Simulation Results – Parameter-Scan: (artiﬁcial data) hard soft 0 2 4 6 8 10 Hard- vs. Soft-Thresholding threshold SNR (db) hard soft 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 threshold RMSE (b) (a) 2 4 6 8 10 2 4 6 8 10 L SNR (db) conventional ti sureshrink heursure 2 4 6 8 10 1.5 2.0 2.5 3.0 3.5 4.0 4.5 L RMSE conventional ti sureshrink heursure (a) (b) → level-dependant thresholding is essential → soft thresholding works better than hard thresholding → there is an optimal resolution level L to begin thresholding of the wavelet coeﬃcients → while the adaptive methods are more stable depending on L, the best per- formance is achieved by ﬁxed thresholding Dynamic Time-Warping Deﬁne a pointwise dissimilarity measure [3] between two signals: d(s(x),u(y )) := θ 1 | ˜ s(x) − ˜ u(y )| + θ 2 | ˜ s ′ (x) − ˜ u ′ (y )|. (5) Find a path p i+1 ∈{(j +1,k ), (j, k + 1), (j +1,k + 1)} through the matrix d(s j ,u k ) that minimizes the sum of the d jk by D j,k = d j,k + min {D j,k −1 ,D j −1,k ,D j −1,k −1 }. ⇒ ”Warping” s according to p, minimizes the diﬀerence of the two signals in (3): DTW(u, s)=˚s = arg min s ‖ s − u ‖ d 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 10 15 20 25 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 1 2 3 θ 2 θ 1 θ 2 θ 1 SNR (dB) RMSE (a) SNR depending on θ 1 and θ 2 (b) RMSE depending on θ 1 and θ 2 0 5 10 15 20 25 30 0 5 10 15 20 25 30 0 5 10 15 20 25 30 -5 0 5 0 5 10 15 20 25 30 -5 0 5 j k s k u j Results: → warping eﬃciency does not depend on θ 1 and θ 2 as long as both are > 0 → We developed an ap- proach that can deal with signals of diﬀerent length. Conclusion and Outlook Correction of temporal distortion can lead to statistically more eﬃcient usage of the data, more reliable interpretation of late components of the ERP. Further validation and reﬁning of the approach is necessary by further testing in settings with correlated noise, by applying the method to real EEG-data. The principle of correcting the temporal distortion of recorded data can be applied to other neuroimaging techniques. Response-Latency Corrected Averaging Idea: Form an average following the model in (2) →determine φ i per trial that minimizes (3) and try to approach u by an estimation-maximization-like approach 1 estimate the single-trial ERPs u i by wavelet-denoising 2 set 1 N ∑ N i=1 s i as a ﬁrst estimate for u 3 apply DTW(u, s i )=˚ s i for all trials 4 set u = 1 N ∑ N i=1 ˚s i and iterate (3) until convergence -500 0 500 1000 1500 2000 -15 -10 -5 0 5 10 15 20 0 200 400 600 800 1000 1200 -15 -10 -5 0 5 10 15 20 -500 0 500 1000 1500 2000 -15 -10 -5 0 5 10 15 20 u μV ms ˆ u μV 〈s i 〉 i u denoised 〈s i 〉 i denoised 〈s i 〉 i u ˆ u μV ms ms σ R = 80ms σ R = 200ms σ R = 80ms Preliminary Results: (artiﬁcial data) For artiﬁcial signals with SNR ≈−5, the algorithm achieves better results in terms of the SNR than a simple pointwise average, even after denoising σ R (ms) SNR(u, ˆ u) SNR(u, 〈u i 〉) SNR(u, 〈s i 〉) 80 9.885 8.934 – 200 8.322 7.7051 6.8297 Even though there are clearly artifacts in the timewarped average, it approaches the shape of u (red curve) better than the simple average. The parameters of the algorithm must in furhter experiments be tuned to yield a smoother curve than depicted. References [1] David L. Donoho and Iain M. Johnstone. Adapting to unknown smoothness via wavelet shrinkage. Journal of the American Statistical Association, 90(432):1200–1224, dec 1995. [2] H. Gibbons and J. Stahl. Response-time corrected averaging of event-related potentials. Clinical Neurophysiology, (118):197–208, 2007. [3] T. Picton, M. Hunt, R. Mowrey, R. Rodriguez, and J. Maru. Evaluation of brain-stem auditory evoked potentials using dynamic time warping. Electroencephalogry and Clinical Neurophysiology, 71(3):212–25, 1988. [4] W. A. Truccolo, M. Ding, K. H. Knuth, R. Nakamura, and S. L. Bressler. Trial-to-trial variability of cortical evoked responses: implications for the analysis of functional connectivity. Clinical Neurophysiology, 113(2):206–226, 2002. [5] Zhisong Wang, Alexander Maier, David A. Leopold, Nikos K. Logothetis, and Hualou Liang. Single-trial evoked potential estimation using wavelets. Computers in Biology and Medicine, 37(4):463–473, April 2007.