arXiv:1506.05869v2 [cs.CL] 23 Jun 2015

A Neural Conversational Model

Oriol Vinyals  VINYALS@GOOGLE.COM
Google
Quoc V. Le  QVL@GOOGLE.COM
Google

Proceedings of the 31st International Conference on Machine Learning, Lille, France, 2015. JMLR: W&CP volume 37. Copyright 2015 by the author(s).

Abstract

Conversational modeling is an important task in natural language understanding and machine intelligence. Although previous approaches exist, they are often restricted to specific domains (e.g., booking an airline ticket) and require hand-crafted rules. In this paper, we present a simple approach for this task which uses the recently proposed sequence to sequence framework. Our model converses by predicting the next sentence given the previous sentence or sentences in a conversation. The strength of our model is that it can be trained end-to-end and thus requires far fewer hand-crafted rules. We find that this straightforward model can generate simple conversations given a large conversational training dataset. Our preliminary results suggest that, despite optimizing the wrong objective function, the model is able to extract knowledge both from a domain-specific dataset and from a large, noisy, and general-domain dataset of movie subtitles. On a domain-specific IT helpdesk dataset, the model can find a solution to a technical problem via conversations. On a noisy open-domain movie transcript dataset, the model can perform simple forms of common sense reasoning. As expected, we also find that a lack of consistency is a common failure mode of our model.

1. Introduction

Advances in end-to-end training of neural networks have led to remarkable progress in many domains such as speech recognition, computer vision, and language processing. Recent work suggests that neural networks can do more than mere classification: they can be used to map complicated structures to other complicated structures.
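The conversational instance of such structure-to-structure mapping can be sketched as a greedy decode loop: consume the previous utterance as context, then emit the reply one token at a time until an end-of-sequence marker. The sketch below is illustrative only and is not the paper's implementation; the `next_token` lookup table is a hypothetical stand-in for a trained recurrent encoder-decoder, and only the shape of the decoding loop mirrors the actual setup.

```python
EOS = "<eos>"  # end-of-sequence marker that terminates decoding

def next_token(context_tokens, reply_so_far):
    """Hypothetical stand-in for a trained model: given the previous
    utterance (context) and the partial reply, return the next token.
    A real system would score tokens with an LSTM over embeddings."""
    table = {
        ("hello",): ["hi", EOS],
        ("how", "are", "you"): ["i", "am", "fine", EOS],
    }
    continuation = table.get(tuple(context_tokens), [EOS])
    return continuation[len(reply_so_far)]

def generate_reply(previous_sentence, max_len=10):
    """Greedy decoding: repeatedly pick the most likely next token
    conditioned on the context and the reply generated so far."""
    context = previous_sentence.lower().split()
    reply = []
    while len(reply) < max_len:
        tok = next_token(context, reply)
        if tok == EOS:
            break
        reply.append(tok)
    return " ".join(reply)
```

In a trained model the same loop applies, but `next_token` is replaced by an argmax (or beam search) over a softmax distribution produced by the recurrent network.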
An example of this is the task of mapping a sequence to another sequence, which has direct applications in natural language understanding (Sutskever et al., 2014). One of the major advantages of this framework is that it requires little feature engineering and domain specificity while matching or surpassing state-of-the-art results. This advance, in our opinion, allows researchers to work on tasks for which domain knowledge may not be readily available, or on tasks which are simply too hard to model.

Conversational modeling can directly benefit from this formulation because it requires mapping between queries and responses. Due to the complexity of this mapping, conversational models have previously been designed to be very narrow in domain, with a major undertaking in feature engineering. In this work, we experiment with the conversation modeling task by casting it as a task of predicting the next sequence given the previous sequence or sequences using recurrent networks (Sutskever et al., 2014). We find that this approach can do surprisingly well at generating fluent and accurate replies in conversations.

We test the model on chat sessions from an IT helpdesk dataset of conversations, and find that the model can sometimes track the problem and provide a useful answer to the user. We also experiment with conversations obtained from a noisy dataset of movie subtitles, and find that the model can hold a natural conversation and sometimes perform simple forms of common sense reasoning. In both cases, the recurrent nets obtain better perplexity than the n-gram model and capture important long-range correlations. From a qualitative point of view, our model is sometimes able to produce natural conversations.

2. Related Work

Our approach is based on recent work which proposed to use neural networks to map sequences to sequences (Kalchbrenner & Blunsom, 2013; Sutskever et al., 2014; Bahdanau et al., 2014).
This framework has been used for neural machine translation and achieves im-