ORIGINAL PAPER

EmoTales: creating a corpus of folk tales with emotional annotations

Virginia Francisco, Raquel Hervás, Federico Peinado, Pablo Gervás

Published online: 23 February 2011
© Springer Science+Business Media B.V. 2011

Lang Resources & Evaluation (2012) 46:341–381
DOI 10.1007/s10579-011-9140-5

V. Francisco (✉), R. Hervás, F. Peinado, P. Gervás
Departamento de Ingeniería del Software e Inteligencia Artificial, Facultad de Informática, Universidad Complutense de Madrid, Madrid, Spain
e-mail: virginia@fdi.ucm.es
R. Hervás e-mail: raquelhb@fdi.ucm.es
F. Peinado e-mail: email@federicopeinado.com
P. Gervás e-mail: pgervas@sip.ucm.es

Abstract Emotions are inherent to any human activity, including human–computer interaction, which is why recognizing emotions expressed in natural language is becoming a key feature in the design of more natural user interfaces. To obtain useful corpora for this purpose, the research community has most commonly relied on the manual classification of texts according to their emotional content. The use of corpora is widespread in Natural Language Processing, and the existing corpora annotated with emotions support the development, training and evaluation of systems that use this type of data. In this paper we present the development of an annotated corpus oriented to the narrative domain, called EmoTales, which uses two different approaches to represent emotional states: emotional categories and emotional dimensions. The corpus consists of a collection of 1,389 English sentences from 18 different folk tales, annotated by 36 different people. Our model of the corpus development process includes a post-processing stage, performed after the annotation of the corpus, in which a reference value for each sentence was chosen by taking into account the tags assigned by annotators and some general knowledge about emotions, which is codified in an ontology. The whole process is presented in detail and reveals significant results regarding the corpus, such as inter-annotator agreement, while discussing topics such as how human annotators deal with emotional content when