Data Mining to Support Human-Machine Dialogue for Autonomous Agents

Susan L. Epstein 1, Rebecca Passonneau 2, Tiziana Ligorio 1, Joshua Gordon 3

1 Hunter College and The Graduate Center of The City University of New York, Department of Computer Science, New York, NY
2 Center for Computational Learning Systems, Columbia University, New York, NY
3 Department of Computer Science, Columbia University, New York, NY
{susan.epstein@hunter.cuny.edu, becky@cs.columbia.edu, tligorio@gc.cuny.edu, joshua@cs.columbia.edu}

Abstract. Next-generation autonomous agents will be expected to converse with people to achieve their mutual goals. Human-machine dialogue, however, is challenged by noisy acoustic data, and by people’s preference for more natural interaction. This paper describes an ambitious project that embeds human subjects in a spoken dialogue system. It collects a rich and novel data set, including spoken dialogue, human behavior, and system features. During data collection, subjects were restricted to the same databases, action choices, and noisy automated speech recognition output as a spoken dialogue system. This paper mines that data to learn how people manage the problems that arise during dialogue under such restrictions. Two different approaches to successful, goal-directed dialogue are identified this way, from which supervised learning can predict appropriate dialogue choices. The resultant models can then be incorporated into an autonomous agent that seeks to assist its user.

Keywords: spoken dialogue systems; Wizard of Oz; human-machine interaction.

1 Introduction

A spoken dialogue system (SDS) is an autonomous agent that communicates with people in the way most natural to them: through spoken language. In pursuit of a common goal, however, such an agent must not only speak to the person, but also listen.
People want human-machine dialogue to be both successful (it accomplishes their goal) and habitable (it demonstrates people’s tacit knowledge about how dialogue should be conducted). To that end, this paper studies how one person manages to help another achieve a goal during interactions that simulate dialogue between a human and an autonomous agent. The primary result reported here is the identification of two different strategies for effective, goal-directed dialogues. One is service-oriented, and