Cooperative Dialogue Planning with User and Situation Models via Example-based Training

Ian R. Lane, Shinichi Ueno, Tatsuya Kawahara
School of Informatics, Kyoto University
Yoshida-Hommachi, Sakyo-ku, Kyoto 606-8501, Japan
e-mail: {ian,kawahara}@ar.media.kyoto-u.ac.jp

Keywords: Dialogue Planning, User Modeling, Machine Learning, Spoken Dialogue System

Abstract

To provide a high level of usability, spoken dialogue systems must generate cooperative responses for a wide variety of users and situations. We introduce a dialogue planning scheme that incorporates user and situation models, making such dialogue adaptation possible. Manually developing a set of dialogue rules to accommodate all possible model combinations is very difficult and obstructs system portability. To overcome this problem, we propose a novel example-based training scheme, in which example dialogues from a role-playing simulation are used to train the dialogue planner via machine learning. The proposed scheme is evaluated on the Kyoto city voice portal, a multi-domain spoken dialogue system. Subjects participated in a role-playing simulation in which they selected appropriate system responses at each dialogue turn based on given scenario information. Experimental results show that the scheme successfully trains the dialogue planner and provides reasonable system performance.

1 Introduction

The continual improvement of speech recognition and mobile communication technologies has enabled the development of interactive voice response (IVR) systems that allow users to obtain a variety of information via mobile-phone-based voice interfaces. However, such systems are typically difficult for non-experts to operate and do not provide cooperative dialogue. Whether a system is cooperative to a user depends on user characteristics, such as whether the user is a novice or in a hurry, as well as external factors such as the time of day.
For a spoken dialogue system to interact cooperatively with a user, such information must be considered during dialogue planning and response generation. Previous research includes several methods for adapting dialogue strategies based on various cues [1, 2, 3, 4, 5]. Factors used for adaptation include the user's knowledge level in the target domain [6, 7] and skill level in using the system [8]. External information such as the time of day and user location was incorporated in a mobile navigation system in [9]. These studies, however, typically focus on a single factor, and the modeling is generally task-dependent. To generate truly cooperative responses, multiple factors must be considered simultaneously during dialogue planning. In this paper, we present a comprehensive modeling scheme to generate user- and situation-adapted responses for spoken dialogue systems. As domain-independent user characteristics, skill level with the system, degree of hastiness, and dialogue-goal clarity are used and detected in real time. External factors, including the time of day, the location of the place of interest, and external events that may affect the task, are also taken into account. These models provide non-linguistic information that enables detailed user- and situation-specific dialogue plans to be generated. The main problem in implementing a dialogue management scheme incorporating the above models is plan complexity. Manually generating an optimal set of dialogue rules to account for all possible model combinations would be very difficult, and there is no guarantee that such rules would generate optimal dialogue flows. To overcome this problem, we introduce a machine learning approach to dialogue planning.
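To make the idea concrete, the following is a minimal sketch of how user and situation models could jointly drive response selection from example dialogues. All names (UserModel, SituationModel, plan_response), the feature encoding, and the nearest-example rule are illustrative assumptions, not the paper's actual implementation; the paper's planner is trained via machine learning on role-playing data, of which a nearest-neighbor lookup is only the simplest instance.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UserModel:
    # Domain-independent user characteristics, detected in real time
    skill: str        # e.g. "novice" or "expert"
    hasty: bool       # degree of hastiness (binarized here for simplicity)
    goal_clear: bool  # dialogue-goal clarity

@dataclass(frozen=True)
class SituationModel:
    # External factors affecting the task
    time_of_day: str   # e.g. "day" or "night"
    event_nearby: bool # an external event that may affect the task

def features(u: UserModel, s: SituationModel) -> tuple:
    """Combine both models into one feature tuple for the planner."""
    return (u.skill, u.hasty, u.goal_clear, s.time_of_day, s.event_nearby)

# Hypothetical example dialogues collected from role-playing sessions:
# each maps a model combination to the response type a subject chose.
EXAMPLES = {
    ("novice", False, False, "day", False): "guidance",
    ("expert", True,  True,  "day", False): "direct_answer",
    ("novice", True,  False, "night", True): "clarify_goal",
}

def plan_response(u: UserModel, s: SituationModel) -> str:
    """Select the response type of the closest training example
    (Hamming distance over the feature tuple)."""
    f = features(u, s)
    def dist(example):
        return sum(a != b for a, b in zip(f, example))
    best = min(EXAMPLES, key=dist)
    return EXAMPLES[best]

print(plan_response(UserModel("novice", False, True),
                    SituationModel("day", False)))  # → guidance
```

A learned classifier trained on the full set of role-playing examples would replace the lookup table and distance rule, but the interface is the same: feature tuples from the two models in, a response type out.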