Comparing Features for Acoustic Anger Classification in German and English IVR Portals

Tim Polzehl 1, Alexander Schmitt 2, and Florian Metze 3

1 Quality and Usability Lab der Technischen Universität Berlin / Deutsche Telekom Laboratories, Ernst-Reuter-Platz 7, D-10587 Berlin, Germany, tim.polzehl@telekom.de
2 Dialogue Systems Group / Institute of Information Technology, University of Ulm, Albert-Einstein-Allee 43, D-89081 Ulm, alexander.schmitt@uni-ulm.de
3 Language Technologies Institute, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, U.S.A., fmetze@cs.cmu.edu

Abstract. Acoustic anger detection in voice portals can help to enhance human-computer interaction. In this paper we report on the performance of selected acoustic features for anger classification. We evaluate the features on both a German and an American English dialogue voice portal database, each of which contains “real” speech, i.e. non-acted, continuous speech of narrow-band quality. Using large-scale feature extraction, we determine the optimal set of features for each language. To obtain the ranking we use an Information-Gain Ratio filter. Analyzing the most promising features, we notice a predominance of MFCC and loudness features. For the English database, however, pitch features also proved important. We further calculate classification scores for our setups using discriminative training and Support-Vector Machine classification. The developed systems show that emotion recognition in English and German can be processed very similarly.

1 Introduction

Detecting emotions in human-computer interaction is gaining more and more attention in the speech research community. Moreover, classifying human emotions by means of automated speech analysis is reaching performance levels that make it applicable not only to basic research but also to deployed systems.
Emotion detection in Interactive Voice Response (IVR) dialogue systems can be used to monitor quality of service or to adapt empathic dialogue strategies [19, 17]. Anger detection in particular can deliver useful information to both the customer and the carrier of IVR platforms. It indicates potentially problematic turns or slots to the carrier, who can then monitor and refine the system. It can further serve