A Prototype Natural Language Interface for Animation Systems

Diana Inkpen and Darren Kipp
University of Ottawa, School of Information Technology and Engineering
diana@site.uottawa.ca, dkipp076@uottawa.ca

Abstract

We present a prototype implementation of a natural language interface to an animation system. The interface provides the means for a human user to issue commands in natural language to an avatar in a virtual reality environment. The purpose of our system is to convert the input text into commands in an animation script language and execute them. Our system uses a general-purpose parser and a domain-specific semantic interpreter based on pattern matching.

1. Introduction

This paper presents a prototype implementation of a natural language interface to an animation system. The main components of the system are a parser, a semantic interpreter, and a command interpreter. The architecture of the system is presented in Figure 1 and explained in detail in section 3. The parser is a general-purpose natural-language parser [3]; it transforms each input sentence into a parse tree. The semantic interpreter takes the parse tree as input and generates commands in an animation script language. The command interpreter executes the animation script using the animation module, which in our case is very simple; it will be replaced by an anthropomorphic avatar [5] in future work.

The semantic interpreter is the core of our system. It directs sub-trees of the parse tree to the appropriate sub-modules. Verb phrases are sent to the “action processor”, which locates the main verb and identifies the command to be generated. The other sub-trees are sent to the “details processor”, which uses a pattern matcher to identify values for the attributes of the commands in the sub-trees. When certain attribute values are not specified, default values are used. For example, if the avatar is told to run without specifying how fast, a default speed is used.
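As a rough illustration of how a details processor of this kind might work, the following sketch matches simple patterns against a phrase and falls back to default attribute values when nothing matches. The pattern set, default table, and function names are our own assumptions for illustration, not the paper's actual parameter-file format.

```python
import re

# Hypothetical defaults; in the paper these would come from parameter files.
DEFAULTS = {
    "walk": {"speed": "normal", "direction": "forward"},
    "run":  {"speed": "fast",   "direction": "forward"},
}

# Toy stand-ins for the details processor's patterns.
PATTERNS = {
    "direction":  re.compile(r"to the (left|right)"),
    "repetition": re.compile(r"(\d+|one|two|three|four|five) steps?"),
}

WORD_NUMBERS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}

def interpret(verb, phrase):
    """Fill command attributes from the phrase; unmatched ones keep defaults."""
    attrs = dict(DEFAULTS.get(verb, {}))
    for name, pattern in PATTERNS.items():
        match = pattern.search(phrase)
        if match:
            value = match.group(1)
            attrs[name] = WORD_NUMBERS.get(value, value)
    return {"command": verb, **attrs}

print(interpret("walk", "five steps to the right"))
```

Here “five steps to the right” sets `direction` and `repetition` from the matched patterns, while `speed` stays at its default because the phrase never specifies it.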
When a sentence contains a conjunction of two verb phrases, two consecutive commands are generated. Their attributes come from the details processor, with attributes in the sub-tree of each verb phrase allowed to overwrite attributes from higher levels in the parse tree. This allows us to capture the correct syntactic and semantic dependencies. An example of input and output to the system is the following (the output format is explained in detail in section 6):

Input: John, walk five steps to the right.
Output: walk speed=5 direction=right repetition=5;

The system is designed to be easily extended to accept other types of sentences without modifying the code, only the text files used as parameters by the semantic interpreter. More patterns (phrase and sentence structures) can be added to the parameter file of the details processor. New commands can be created by adding more verbs and their synonyms to the parameter file of the action processor. To accommodate more than one avatar, another attribute can be added to all commands to store the name of the avatar.

2. Related work

There is a lot of related work on natural language interfaces. The two main directions are more or less ad-hoc systems for specific domains (see [1] [2] for

1 Copyright 2002 IEEE. Published in the Proceedings of HAVE/2002, Nov. 2002, Ottawa, ON, Canada. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE. Contact: Manager, Copyrights and Permissions / IEEE Service Center / 445 Hoes Lane / P.O. Box 1331 / Piscataway, NJ 08855-1331, USA. Telephone: + Intl. 732-562-3966.