A GOOD GESTURE: EXPLORING NONVERBAL COMMUNICATION FOR ROBUST SLDSs

Beatriz López Mencía¹, Álvaro Hernández Trapote¹, David Díaz Pardo de Vera¹, Doroteo Torre Toledano², Luis A. Hernández Gómez¹, Eduardo López Gonzalo¹

¹ Grupo de Aplicaciones del Procesado de Señales, E.T.S.I.T, Universidad Politécnica de Madrid
² Escuela Politécnica Superior, Universidad Autónoma de Madrid

ABSTRACT

In this paper we propose a research framework to explore the possibilities that state-of-the-art embodied conversational agents (ECAs) technology can offer to overcome typical robustness problems in spoken language dialogue systems (SLDSs), such as error detection and recovery, changes of turn and clarification requests, that occur in many human-machine dialogue situations in real applications. Our goal is to study the effects of nonverbal communication throughout the dialogue, and find out to what extent ECAs can help overcome user frustration in critical situations. In particular, we have created a gestural repertoire that we will test and continue to refine and expand, to fit as closely as possible the users’ expectations and intuitions, and to favour a more efficient and pleasant dialogue flow for the users. We also describe the test environment we have designed, simulating a realistic mobile application, as well as the evaluation methodology for the assessment, in forthcoming tests, of the potential benefits of adding nonverbal communication in complex dialogue situations.

1. INTRODUCTION

Spoken language dialogue systems and embodied conversational agents are being introduced in a rapidly increasing number of Human-Computer Interaction (HCI) applications. The technologies involved in SLDSs (speech recognition, dialogue design, etc.) are mature enough to allow the creation of trustworthy applications. However, robustness problems still arise in concrete limited dialogue systems because there are many error sources that may cause the system to perform poorly [1].
At the same time, embodied conversational agents (ECAs) are gaining prominence in HCI systems, since they make for more user-friendly applications while increasing communication effectiveness. There are many studies on the effects ECAs have on users of a variety of applications, ranging from psychological effects to efficiency in goal achievement (see [2] and [3]), but still very few (see [4]) on the impact of ECAs in directed dialogue situations where robustness is a problem. We propose looking into the effects of adding an ECA to a concrete spoken dialogue system, and the potential benefits this may have, particularly in the difficult dialogue situations already identified by leading authors in the field ([5] and [6]). This paper outlines the main elements of a research framework we have designed for these purposes.

Our research group is paying particular attention to videotelephony applications, whose visual channel now makes it possible to incorporate ECAs. Videotelephony has its own peculiarities, which may be relevant to take into account when developing ECAs for it (for instance, screen space is more limited).

2. HOW ECAS CAN BE USEFUL

There are many nonverbal elements of communication in everyday life that are important because they convey a considerable amount of information and qualify the spoken message, sometimes even to the extent that what is meant is actually the opposite of what is said. Showing objects, types of behaviour, mood, reactions and emotions, and pointing towards something in the referential context (deictic gestures), are some of the functions of nonverbal language, which carries a great amount of semantic content related to people’s attitudes and intentions in interaction processes (see [7]). ECAs offer the possibility to combine several communication modes such as speech and gestures, making it possible, in theory, to create interfaces with which human-machine interaction is much more natural and comfortable.
Although we are still a long way from understanding how best to incorporate nonverbal communication to improve human-machine dialogue, ECAs are already being employed to improve interaction (see [8]). These are some situations in which an ECA could have a positive effect:

• Efficient turn management: the body language and expressiveness of agents are important not only to reinforce the spoken message, but also, as Cassell points out [2], to regulate the flow of the dialogue.

• Improving error recovery: the process of recognition error recovery usually leads to a certain degree of user frustration (see [9]). ECAs may help reduce frustration, and by doing so make error recovery more effective [10]. Indeed, once an error occurs it is common to enter an error spiral, in which the system is trying to recover, the user gets ever more frustrated, and this frustration interferes with the recognition process (since, for example, users often repeat their previous utterance in a way that the system is less likely to understand), making the situation worse [11].

• Correct understanding of the state of the dialogue: a common problem in dialogue systems is that the user doesn’t know whether or not the process is working normally [12].

IV Jornadas en Tecnología del Habla, Zaragoza, 8-10 November 2006