Artificial Intelligence Review (1987) 1, 139-157

What we say and what we mean

A. Ramsay
Cognitive Studies Program, University of Sussex, Falmer BN1 9QN, UK

Abstract. The general problem of natural language processing has not been solved, and may never be. Nonetheless, there are now a number of well-known techniques for certain aspects of the task; and there is a certain amount of agreement about what other problems need to be tackled, if not about how to tackle them. The current paper gives a survey of what we do know, and indicates the areas in which further progress remains to be made.

Linguistic and non-linguistic knowledge

Understanding and generating natural language seems to call upon two sorts of knowledge. It requires a substantial amount of knowledge about language itself - what strings of sounds and letters are in fact words of the given language, what sequences of words are well-formed sentences, how do different word-orders encode different messages, and so on. And it requires a similar amount of knowledge about the world in general - what sorts of things can a listener generally be expected to know in advance so they need not be mentioned explicitly, what can the partners in a conversation reasonably be expected to infer from what has been said already, what names do people use for referring to things and each other, and so on. These two sorts of knowledge might be seen as being concerned with what we say, and what we mean by it, respectively. The purely linguistic knowledge specifies how strings of sounds or characters encode messages; the general world knowledge is used for working out what actions are appropriate given the message encoded by what we have just heard or read. They also correspond, to some extent, to a split between what we do and do not know how to make computers do.
I would not want to suggest that the linguistic questions are solved, or anything like it, but there do certainly seem to be some theories about what needs to be done and how to do it. We have computer programs which, for non-trivial fragments of the lower levels of language processing, seem to work. As far as the use of world knowledge is concerned, we are in a much weaker position. We do know something about what needs to be done, about what sort of knowledge is required and when, but we have very few practical theories about how to deliver it. This will become apparent as we progress through our survey of the components of natural language processing systems (NLP systems).