Toward a Programmatic Semantics of Natural Language

Hugo Liu and Henry Lieberman
MIT Media Laboratory, Massachusetts Institute of Technology
{hugo, lieber}@media.mit.edu

Abstract

Natural language is imbued with a rich semantics, but its complex elegance is often mistaken for mere imprecision. Because complete parsers of English are not yet achievable, people assume that it is not feasible to use English directly as a means of instructing computers. In this paper, however, we show that English descriptions of procedures often contain programmatic semantics: linguistic features that can be easily mapped into programming language constructs. Some linguistic features can even inspire new ways of thinking about specifying programs. Far from being hopelessly ambiguous, natural languages exhibit important principles of communication that could be used to make human-computer communication more natural.

1. English as a Programming Language

Natural human language is often dismissed as too informal and ambiguous to compute with and program in, because it does not obey the rigor of logic. Rather than relying on absolute truth and deduction, natural language and human reasoning rely on abduction, or evidentiary reasoning. By modeling abduction probabilistically, it may then be possible to create quasi-formalisms for natural language.

Recently, we investigated the feasibility of enabling programming in natural language [1] by reviewing Pane and Myers's study of fifth graders' plain English descriptions of a Pacman game [3]. We then implemented a system called Metafor, capable of rendering simple children's stories like “Humpty Dumpty” as “scaffolding code” (descriptive, but not complete enough to execute) in Python.
The surprising conclusion of these exercises was that, in most cases, fairly direct mappings are possible from parsed English to the control and data structures of traditional high-level programming languages like LISP and Python. This suggests that even a naïve, direct transliteration of English into programmatic code is unexpectedly good. If mapping difficulties can be attributed to anything, it is that English often exhibits some very desirable programming semantics that are either not yet available in programming languages or have only recently become available as advanced features.

Drawing from the findings of our feasibility study and implementation of Metafor, this paper aims to elucidate some of the basic programmatic features of natural language, with English as the exemplar.

2. Programmatic Features of English

In this section, we rely on salient examples to illustrate the mappings from natural language syntax and semantics to programming constructs. Examples are drawn from the Pacman domain, given in [3].

Syntactic Typing. For the most part, the major syntactic parts of speech of natural language correspond to distinct syntactic types in programming languages. Action verbs, that is, non-copular verbs (everything except verbs like “to be” and “to seem”), map to functions (e.g. “eat”, “chase”), while noun phrases map to classes (e.g. “the maze”, “dots”). Adjectival modifiers map to properties of a class (e.g. “yellow blinking dots”), and adverbial modifiers map to auxiliary arguments to functions (e.g. “chomps dots quickly” → chomp(dot,speed=quickly)).

A class can be further distinguished as either an active agent imbued with methods or a passive structure with only properties. A noun phrase that is the subject of an action verb is usually an agent (e.g. “Ghosts chase”). Conversely, if it only plays the object of an action verb (e.g. “eats dots”) or the subject of a copular-passive verb (e.g. “dots are eaten”), this is more suggestive of a passive structure.
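The part-of-speech mappings described above can be sketched in Python as follows. This is only an illustrative sketch, not Metafor's actual output: the class names `Dot`, `Pacman`, and `Ghost`, the `eaten` flag, the `eaten_count` counter, and the default value of `speed` are all assumptions introduced here for the example.

```python
from dataclasses import dataclass


@dataclass
class Dot:
    """Passive structure: 'dots' appears only as an object ('eats dots')
    or as the subject of a copular-passive verb ('dots are eaten')."""
    color: str = "yellow"    # adjectival modifier 'yellow' -> property
    blinking: bool = True    # adjectival modifier 'blinking' -> property
    eaten: bool = False      # hypothetical flag recording 'dots are eaten'


@dataclass
class Pacman:
    """Active agent: subject of the action verb 'chomp', so it gets a method."""
    eaten_count: int = 0     # hypothetical bookkeeping, not from the paper

    def chomp(self, dot: Dot, speed: str = "normal") -> str:
        # The adverbial modifier 'quickly' becomes the auxiliary argument `speed`.
        dot.eaten = True
        self.eaten_count += 1
        return f"chomp(dot, speed={speed})"


@dataclass
class Ghost:
    """Active agent: 'Ghosts chase' makes the noun phrase a subject of an action verb."""

    def chase(self, target: Pacman) -> str:
        return f"chase({type(target).__name__.lower()})"


# "Pacman chomps dots quickly"
pacman, dot = Pacman(), Dot()
print(pacman.chomp(dot, speed="quickly"))  # prints: chomp(dot, speed=quickly)
```

Note that `Dot` carries only properties while `Pacman` and `Ghost` carry methods, mirroring the distinction between passive structures and active agents.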
Prepositional phrases attached to a verb usually play the role of function arguments (e.g. “eats dot by chomping”). The nuanced ways in which different prepositions imply different sorts of arguments (e.g. governing dimensions of duration, speed, manner, etc.) are well beyond the scope of this paper.

Inheritance. Natural language relies heavily on inheritance from known exemplars. In fact, we can view the primary purpose of having massive amounts of “common sense” knowledge about the everyday world as building up a sufficiently rich programmatic library of exemplars from which we can inherit during story understanding. Linking a novel concept to known exemplars can be done explicitly, but also implicitly via structural abduction. First, explicit inheritance is

Proceedings of the 2004 IEEE Symposium on Visual Languages and Human Centric Computing (VLHCC’04) 0-7803-8696-5/04 $20.00 IEEE