Biomedical Signal Processing and Control 7 (2012) 79–87
Acoustic feature selection and classification of emotions in speech using a 3D
continuous emotion model
Humberto Pérez-Espinosa∗, Carlos A. Reyes-García, Luis Villaseñor-Pineda
Instituto Nacional de Astrofísica, Óptica y Electrónica, Luis Enrique Erro 1, Tonantzintla, Puebla 72840, Mexico
Article info
Article history:
Received 15 September 2010
Received in revised form 23 February 2011
Accepted 24 February 2011
Available online 3 April 2011
Keywords:
Automatic emotion recognition
Continuous emotion model
Feature selection
Abstract
In this paper we report the results of experiments with a database of emotional speech
in English, carried out to find the most important acoustic features for estimating the
Emotion Primitives that determine the emotional content of speech. We are interested in
exploiting the potential benefits of continuous emotion models; therefore, we demonstrate
the feasibility of applying this approach to the annotation of emotional speech, and we
explore ways to take advantage of this kind of annotation to improve the automatic
classification of basic emotions.
© 2011 Elsevier Ltd. All rights reserved.
1. Introduction
Emotions are very important in our everyday lives; they are present in everything we do.
There is a continuous interaction between emotions, behavior, and thoughts, such that they
constantly influence each other. Emotions are a great source of information in
communication and interaction among people, and they are assimilated intuitively.
The applications of emotion recognition encompass many fields,
for example, as a supporting technology in medical areas such as
psychology, neurology, and the care of elderly and impaired people.
Automatic emotion recognition based on biomedical signals and facial
and vocal expressions has been applied to the diagnosis and follow-up
of progressive neurological disorders, specifically Huntington's
and Parkinson's diseases [1]. These pathologies are characterized
by a deficit in the emotional processing of fear and disgust; thus,
the system could be used to determine a subject's reaction, or absence
of reaction, to specific emotions, helping health professionals
gain a better understanding of these disorders.
Furthermore, the system could be used as a reference to evaluate
patients' response to certain medicines. Another important
medical application is remote medical support. These kinds of environments
enable communication between medical professionals
and patients for regular monitoring and for emergency situations.
In this scenario the system recognizes the patient's emotional
∗ Corresponding author.
E-mail addresses: humbertop@inaoep.mx (H. Pérez-Espinosa), kargaxxi@inaoep.mx (C.A. Reyes-García), villasen@inaoep.mx (L. Villaseñor-Pineda).
states and transmits the corresponding data; when these indicate that the patient
is experiencing depression or sadness, the health-care providers monitoring them
will be better prepared to respond. Such a system has the potential
to improve patient satisfaction and health [2–4]. It can also be
an asset for disabled people who have difficulties with communication.
Hearing-impaired people who are not profoundly deaf can
use residual hearing to communicate with other people, and learning to
speak with emotion makes communication more complete and
understandable. In these cases, emotion recognition engines can
be used as an important element of a computer-assisted emotional
speech training system [5]. For hearing-impaired people, such a system could
provide an easier way to learn how to speak with emotion more naturally,
or help speech therapists guide them to express
emotions in speech correctly. Emotion recognition also arouses great interest
in interface design, given that automatically recognizing and understanding
emotions is one of the key steps towards emotional
intelligence in Human–Computer Interaction (HCI). The need for
automatic emotion recognition has emerged from the tendency
towards more natural interaction between humans and computers.
Affective computing is a topic within HCI that encompasses this
research tendency, trying to endow computers with the ability to
detect, recognize, model, and take into account the user's emotional
state, which plays a role of paramount importance in the way humans
make decisions [6]. Emotions are essential to the human thought processes
that influence interactions between people and intelligent
systems.
In the area of automatic emotion recognition, two main
annotation schemes have been used to capture and describe the
emotional content of speech: the discrete and continuous approaches.
The discrete approach is based on the concept of basic emotions, such
as anger, joy, and sadness, which are the most intense forms of emo-
1746-8094/$ – see front matter © 2011 Elsevier Ltd. All rights reserved.
doi:10.1016/j.bspc.2011.02.008