EmoSonics – Interactive Sound Interfaces for the Externalization of Emotions

Thomas Hermann, Ambient Intelligence Group, CITEC, Bielefeld University, Bielefeld, Germany, thermann@techfak.uni-bielefeld.de
Jiajun Yang, Ambient Intelligence Group, CITEC, Bielefeld University, Bielefeld, Germany, jyang@techfak.uni-bielefeld.de
Yukie Nagai, Graduate School of Engineering, Osaka University, Osaka, Japan, yukie@ams.eng.osaka-u.ac.jp

ABSTRACT
This paper presents a novel approach for using sound to externalize emotional states so that they become an object for communication and reflection, both for the users themselves and for interaction with other users such as peers, parents or therapists. We present abstract, vocal, and physiology-based sound synthesis models whose sound spaces each cover various emotional associations. The key idea in our approach is to use evolutionary optimization to enable users to find emotional prototypes, which are in turn fed into a kernel-regression-based mapping that allows users to navigate the sound space via a low-dimensional interface, controlled in a playful way via tablet interactions. The method is intended to support people with autism spectrum disorder.

CCS Concepts
• Human-centered computing → Auditory feedback; Accessibility technologies; Accessibility systems and tools;
• Applied computing → Sound and music computing;

Keywords
Emotions, Sound, Auditory Display, Autism Spectrum Disorder (ASD)

1. INTRODUCTION
Emotions play an important role in our experience of the world and of ourselves, as they color experience and influence our decisions and their execution. Furthermore, the expression and perception of emotions are highly relevant for social interaction. Since emotional intelligence is a complex function of the mind that usually operates automatically, we are rarely aware of how exactly the dynamics of emotions operate and manifest. Emotions normally manifest in a number of carriers such as facial expression, movement, prosody in spoken language, or explicitly in the choice of spoken words. They furthermore correlate with and influence physiological states such as heart rate and body tension. Yet if – for instance due to certain dysfunctions or disorders such as Autism Spectrum Disorder (ASD) – the perception, processing and manifestation of emotional signals is hindered, this significantly affects individual and social life [9]. Failing to express feelings or pain may lead to many negative effects, including increased anger or even self-injury [12]. This is risky because people with ASD are often unable to seek help through channels such as facial expression, and caregivers, for the same reason, may fail to notice their distress.
A study shows that a person with ASD may appear completely calm while having an unusually high resting heart rate [4].

With this research we set out to develop a sound-based interface that might, to some degree, bypass cognitive processing steps and facilitate the expression of emotions in a direct way. Using sound is motivated by the fact that sound is an established and important carrier of emotional charge: consider, for instance, film music, the expressiveness achieved by prosody, or music in general. By means of the interactive synthesis of emotional sounds, and an iterative refinement of the sounds to match the inwardly felt emotions, the user is expected to engage in an inner ‘self-reflective dialogue’ in which the synthesized sound becomes a mirror image of the sensed state. This may help users (a) to pay more attention to their emotions and to understand and observe them more clearly, (b) to simultaneously find novel ways to express them, i.e. to bring them to the surface, which is particularly relevant if emotional production is hindered, and (c) to render emotions more ‘tangible’ as a shared resource to be worked with, for instance, as a therapeutic means.

ASD patients could profit from such a technique that allows them to express emotions towards others (peers, caregivers, parents), or even to train the perception of emotional categories, as they manifest in sound, in game-like interactions.

In this paper we first review related work at the intersection of sound and emotion, then outline basic assumptions about and representations of emotions as a basis for the definition of continuous sound models that enable the expression of various emotional signals as well as their interpolation and morphing. These models necessarily have a large number of parameters, which complicates their adjustment towards clear emotional expressions. We therefore proceed with an approach inspired by evolutionary optimization techniques, in which the user merely iteratively selects one of several offered sound variations, so that the sound gradually evolves towards the intended emotional expression.
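To make this two-stage pipeline concrete, the following minimal Python sketch illustrates the idea: a user-in-the-loop evolutionary selection of an emotional prototype, followed by a kernel-regression mapping from 2D tablet positions to full synthesis parameter vectors. This is an illustrative sketch under assumptions (Gaussian mutation and kernel, synthesis parameters normalized to [0, 1]; all function names are hypothetical), not the implementation described in this paper.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def mutate(parent, sigma, n_children=6):
    """Propose candidate sounds as Gaussian mutations of the current
    synthesis parameter vector, clipped to the valid range [0, 1]."""
    offsets = sigma * rng.standard_normal((n_children, parent.size))
    return np.clip(parent + offsets, 0.0, 1.0)

def evolve_prototype(parent, choose, n_rounds=10, sigma=0.15):
    """User-guided (1, n)-style selection: in each round the user listens
    to the candidates and picks the one closest to the felt emotion;
    the mutation strength shrinks to refine the search."""
    for _ in range(n_rounds):
        candidates = mutate(parent, sigma)
        parent = candidates[choose(candidates)]  # choose() plays the sounds, returns an index
        sigma *= 0.8
    return parent  # the emotional prototype

def kernel_regression_map(xy, anchors_xy, prototypes, bandwidth=0.25):
    """Nadaraya-Watson kernel regression: map a 2D tablet position to a
    synthesis parameter vector as a distance-weighted blend of the
    prototypes placed at the given 2D anchor positions."""
    sq_dist = np.sum((anchors_xy - np.asarray(xy)) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return weights @ prototypes / weights.sum()
```

In such a setup, each prototype found via evolve_prototype would be assigned an anchor position on the tablet surface; dragging a finger across the surface then yields a smooth interpolation and morphing between the corresponding emotional sounds.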