BECAM Tool - A Semi-automatic Tool for Bootstrapping Emotion Corpus Annotation and Management

Slim Abdennadher¹, Mohamed Aly¹, Dirk Bühler², Wolfgang Minker², Johannes Pittermann²

¹ Department of Computer Science, German University in Cairo, Egypt
² Institute of Information Technology, University of Ulm, Germany

slim.abdennadher@guc.edu.eg, mohamed.abdulazim@student.guc.edu.eg, {dirk.buehler, wolfgang.minker, johannes.pittermann}@uni-ulm.de

Abstract

Corpus annotation is an important aspect of speech applications in which stochastic models need to be trained and evaluated. Multimodal corpora are also annotated. Moreover, corpus annotation is an essential phase in the construction of emotion recognizer engines. Large corpora, although essential for constructing representative knowledge bases, have been a problem for corpus annotators. The time consumed in labeling such corpora is very significant. Furthermore, managing them becomes more arduous and tedious. In this paper, we propose a semi-automatic tool, called the BECAM tool, that helps corpus annotators manage and annotate large-sample emotion corpora.

Index Terms: Corpus annotation, emotion recognition, bootstrap

1. Introduction

The need for computers to "think" and "feel" is an issue that has been a subject of research for many years. People are increasingly involved with computers in their daily lives, and they are looking for a richer interaction experience. Among human beings, the fact that we understand each other and react to each other's emotions makes our communication and interaction very versatile. Computers, on the other hand, are still "emotionally challenged" [1]. Many studies have been conducted on emotion recognition and its integration into computer applications. This integration might change the way a user interacts with the computer and provide the user with a new level of experience [2].
A variety of statistical approaches have been used for emotion recognition. Although there are other approaches, most of the work in this area [3, 4] relies on Hidden Markov Models (HMMs), which can be built and manipulated with the Hidden Markov Model Toolkit (HTK) [5]. Building a reliable model for emotion recognition requires a corpus with a large sample size. The sample sound files in a corpus need to be analyzed by the annotator and labeled to identify the emotions within each sample. The labeled sample files are then used as a knowledge base to train an emotion recognition engine. With an increasing sample size, the process of annotation becomes more time consuming and the corpus becomes significantly harder to manage.

Various approaches have been considered by corpus annotators to reduce the time needed for annotation. The bootstrap approach [6] is one of them. It is intended to decrease the time needed for manual annotation by automatically generating annotations for unannotated samples based on manually annotated samples. The automatically annotated samples are then manually corrected and added to the corpus, and the process proceeds iteratively until all samples are annotated. In this paper, we propose a semi-automatic tool based on HTK and a client-server architecture that enables corpus annotators to perform bootstrapping on emotion corpora with more flexibility, manageability, and reliability. We call the tool BECAM, which stands for Bootstrapping Emotion Corpus Annotation and Management.

From the survey of existing annotation tools in [7], we concluded that the main goal of the existing tools is to support various methodologies for corpus annotation. Analyzing the purpose of the tools surveyed, we found that they are designed for special or limited purposes.
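The iterative bootstrap procedure described in the Introduction can be illustrated with a minimal Python sketch. It is not the BECAM implementation: the function name `bootstrap_annotate`, the `train` and `correct` callables, and the `batch_size` parameter are hypothetical names introduced here, and the HTK-based model training is replaced by an arbitrary callable so that the example stays self-contained.

```python
from collections import Counter
from typing import Callable, Dict, Sequence

Label = str
Model = Callable[[str], Label]


def bootstrap_annotate(
    samples: Sequence[str],
    seed_labels: Dict[str, Label],
    train: Callable[[Dict[str, Label]], Model],
    correct: Callable[[str, Label], Label],
    batch_size: int = 2,
) -> Dict[str, Label]:
    """Iteratively annotate `samples`, starting from manually labeled seeds.

    train   -- builds a model from the samples labeled so far
               (assumption: stands in for HTK-based training)
    correct -- manual correction step; returns the final label
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    labeled = dict(seed_labels)
    pending = [s for s in samples if s not in labeled]
    while pending:
        model = train(labeled)  # retrain on everything annotated so far
        batch, pending = pending[:batch_size], pending[batch_size:]
        for sample in batch:
            proposed = model(sample)  # automatic annotation
            labeled[sample] = correct(sample, proposed)  # manual correction
    return labeled


# Toy stand-ins (assumptions): a majority-label "model" and an
# annotator who always knows the true label.
def train_majority(labeled: Dict[str, Label]) -> Model:
    majority = Counter(labeled.values()).most_common(1)[0][0]
    return lambda sample: majority


truth = {"a.wav": "anger", "b.wav": "anger", "c.wav": "fear", "d.wav": "anger"}
result = bootstrap_annotate(
    list(truth), {"a.wav": "anger"}, train_majority,
    lambda sample, proposed: truth[sample],
)
```

In this toy run the annotator only has to change the proposed label where the model is wrong, which is the source of the time saving the bootstrap approach aims for.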
Various methods are used for annotation, and various file and export formats are used among the tools. To our knowledge, our tool differs from the existing ones in that we do not deal with the annotation process itself; instead, our focus is on automating the process by applying the bootstrap approach.

The paper is organized as follows. In Section 2 we introduce an example of the data used in emotion recognition as well as its annotation, and we describe how the bootstrap process operates. In Sections 3 and 4 we describe the design, implementation and usage of the BECAM Tool, and in Sections 5 and 6 we evaluate the tool and discuss our future directions and intentions.

2. Processing Emotional Data

2.1. Example of the data and annotation

For our emotion recognition experiments, the Database of German Emotional Speech of the Technical University of Berlin [8] has been used. This database includes six of the basic emotions (German terms in brackets), namely anger (Wut), boredom (Langeweile), disgust (Ekel), fear (Angst), happiness (Freude) and sadness (Trauer), along with neutral recordings serving as references. Ten actors, 5 female and 5 male speakers, performed ten different everyday-speech utterances such as "The cloth is lying on the fridge" ("Der Lappen liegt auf dem Eisschrank") or "She will hand it in on Wednesday" ("Das wird sie am Mittwoch abgeben"), each utterance in all emotions, allowing high comparability across emotions and speakers. As the recordings were made in an anechoic chamber, background noise could be minimized and therefore had no disturbing effect on the experiments. The emotional quality was rated by 20