ORIGINAL RESEARCH ARTICLE
published: 01 May 2014
doi: 10.3389/fnins.2014.00094

Fusion of electroencephalographic dynamics and musical contents for estimating emotional responses in music listening

Yuan-Pin Lin 1,2*, Yi-Hsuan Yang 3 and Tzyy-Ping Jung 1,2

1 Swartz Center for Computational Neuroscience, Institute for Neural Computation, University of California, San Diego, La Jolla, CA, USA
2 Center for Advanced Neurological Engineering, Institute of Engineering in Medicine, University of California, San Diego, La Jolla, CA, USA
3 Music and Audio Computing Lab, Research Center for IT Innovation, Academia Sinica, Taipei, Taiwan

Edited by: Jan B. F. Van Erp, Toegepast Natuurwetenschappelijk Onderzoek, Netherlands
Reviewed by: Kenji Kansaku, Research Institute of National Rehabilitation Center for Persons with Disabilities, Japan; Dezhong Yao, University of Electronic Science and Technology of China, China
*Correspondence: Yuan-Pin Lin, Swartz Center for Computational Neuroscience, Institute for Neural Computation, University of California, San Diego, 9500 Gilman Drive, Mail code 0559, La Jolla, CA 92093-0559, USA. e-mail: yplin@sccn.ucsd.edu

Electroencephalography (EEG)-based emotion classification during music listening has gained increasing attention in recent years because of its promise for applications such as musical affective brain-computer interfaces (ABCIs), neuromarketing, music therapy, and implicit multimedia tagging and triggering. However, music is an ecologically valid yet complex stimulus that conveys emotions to listeners through the composition of musical elements, and distinguishing emotions from EEG signals alone remains challenging. This study aimed to assess the applicability of a multimodal approach that leverages both EEG dynamics and the acoustic characteristics of musical contents for the classification of emotional valence and arousal. To this end, machine-learning methods were adopted to systematically elucidate the roles of the EEG and music modalities in emotion modeling. The empirical results suggested that when whole-head EEG signals were available, the inclusion of musical contents did not improve classification performance; the accuracy of 74–76% obtained using the EEG modality alone was statistically comparable to that of the multimodal approach. However, when EEG dynamics were available from only a small set of electrodes (likely the case in real-life applications), the music modality played a complementary role, augmenting the EEG results from around 61% to 67% in valence classification and from around 58% to 67% in arousal classification. Musical timbre appeared to replace less-discriminative EEG features and led to improvements in both valence and arousal classification, whereas musical loudness contributed specifically to arousal classification. The present study not only provided principles for constructing an EEG-based multimodal approach, but also revealed fundamental insights into the interplay of brain activity and musical contents in emotion modeling.
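To make the fused classification scheme concrete, the following minimal Python sketch illustrates one common realization: feature-level fusion, in which pre-computed EEG band-power features and musical descriptors are concatenated and fed to a single classifier evaluated by cross-validation. The array shapes, placeholder data, feature groupings, and the scikit-learn SVM pipeline are illustrative assumptions for exposition only, not the exact method used in this study; decision-level fusion (combining per-modality classifier outputs) is an alternative to this concatenation scheme.

```python
# Illustrative sketch (not the authors' exact pipeline): feature-level fusion of
# EEG and music features for binary valence (or arousal) classification.
# All data below are random placeholders standing in for real extracted features.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials = 120                                     # hypothetical music-listening trials
eeg_power = rng.normal(size=(n_trials, 30 * 5))    # e.g., 30 channels x 5 frequency bands
music_feat = rng.normal(size=(n_trials, 20))       # e.g., timbre/loudness/rhythm descriptors
valence = rng.integers(0, 2, size=n_trials)        # binary emotion labels (placeholder)

# Feature-level fusion: concatenate the two modalities before classification.
fused = np.hstack([eeg_power, music_feat])

# Standardize features, then classify with an RBF-kernel support vector machine.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
acc = cross_val_score(clf, fused, valence, cv=10).mean()
print(f"10-fold CV accuracy (fused features): {acc:.2f}")
```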
Keywords: EEG, emotion classification, affective brain-computer interface, music signal processing, music listening

INTRODUCTION
Through monitoring ongoing electrical brain activity, electroencephalography (EEG)-based brain-computer interfaces (BCIs) allow users to voluntarily translate their intentions into commands to communicate with or control external devices and environments, instead of using conventional communication channels such as speech and muscles (Millan et al., 2010). Several types of EEG signatures have been theoretically defined and empirically proven robust for actively and reactively actuating BCIs (Zander and Kothe, 2011), such as evoked potentials, event-related potentials (ERPs), and sensorimotor rhythms (Wolpaw et al., 2002). More recently, a new category called passive BCI was introduced (Zander and Kothe, 2011). It enables users to involuntarily interact with machines by means of implicit user states, e.g., emotion. Researchers are attempting to augment BCIs with emotional awareness and intelligence that respond to users' emotional states, giving rise to so-called affective brain-computer interfaces (ABCIs).

Emotion is a psycho-physiological process as well as a natural communication channel of human beings. Music is considered an extraordinary mediator that evokes emotions and concurrently modulates the underlying neurophysiological processes (Blood et al., 1999). Building on profound findings in musical emotion, the use of machine-learning methods to characterize the spatio-spectral EEG dynamics associated with emotions, namely EEG-based emotion classification, has gained increasing attention in the last decade due to its promise for applications such as musical ABCI (Makeig et al., 2011), neuromarketing (Lee et al., 2007), music therapy (Thaut et al., 2009), and implicit multimedia tagging (Soleymani et al., 2012a; Koelstra and Patras, 2013) and triggering (Wu et al., 2008). Given the diversity of EEG patterns, the major effort in previous EEG-based emotion classification studies (not limited to music stimuli) has been to seek an optimal emotion-aware model by leveraging feature extraction, selection, and classification methods (Ishino and Hagiwara, 2003; Takahashi, 2004; Chanel et al., 2009; Frantzidis et al., 2010; Lin et al., 2010b; Petrantonakis