Psychological Review
1976, Vol. 83, No. 4, 310-317

Sentence-Picture Verification Models as Theories of Sentence Comprehension: A Critique of Carpenter and Just

M. K. Tanenhaus, J. M. Carroll, and T. G. Bever
Psycholinguistics Program, Columbia University

We consider several recent information-processing models of sentence-picture matching to assess their implications for sentence processing. The representational component of the models describes a task-specific "verification representation" that is derived from a more general representation needed for comprehension. The specific models do not describe the processes by which these representations are derived; nor do the assumptions common to the models shed light on the structure of these verification representations. The models are, at best, detailed descriptions of the processes by which subjects verify sentences that they have already understood against pictures that they have already perceived.

How do we understand the relevant meaning of sentences used in ordinary contexts? This is a central problem in psychology and a primary preoccupation of the psycholinguist. One particularly difficult aspect of this problem is that we know very little about the role of contexts in language comprehension. A natural starting point for investigating this question is to limit the context, the sentence, and their relation in an experimental task.

Recently, this approach has been applied to sentence verification in tasks in which subjects are asked to judge whether a picture matches a sentence. The verification times are used to motivate "information processing" models that describe what subjects do as a stage-by-stage process. These models appear to be elegant paradigms for the study of comprehension. They contain both a representational and a processing component, and they utilize the apparent formalistic rigor of the information-processing approach.
In the latest of these proposals, Carpenter and Just (1975) present a broad-ranging information-processing model that encompasses most of the previous models. In the present paper we examine several of these models, focusing on Carpenter and Just's interpretation in order to evaluate the actual and potential contribution of models of sentence-picture verification tasks to the study of comprehension.

We are indebted to Doris Aaronson, David Hakes, Margot Lasher, Zenon Pylyshen, and Michael Valenti for comments on an early draft. Requests for reprints should be sent to Michael K. Tanenhaus, Department of Psychology, Schermerhorn Hall, Columbia University, New York, New York 10027.

We demonstrate that the kind of representation modeled in sentence-picture verification experiments is conditioned by the verification task. The representational component of the models describes a verification representation that is itself derived from the more general representation needed for comprehension. The models do not specify the process by which the verification representation is derived; nor do the common set of assumptions claimed for the class of models provide a set of principles for understanding the structure or development of the verification representations. Finally, we consider several problems that may limit the value of these models for the analysis of cross-modal verification.

The Constituent Comparison Model

In the Carpenter and Just (1975) model, the picture and the sentence are first represented as embedded propositional structures. The propositional encodings of the picture and the sentence are then compared from the inside out (i.e., beginning with the innermost embedded predicates). The subject is assumed to have a conceptual response index, which is initialized at TRUE. A mismatch between propositions changes the response index (e.g., from TRUE to FALSE) and reinitializes the comparison process.
Each comparison of corresponding constituents (predicates in the picture and sentence representations) adds a constant increment of time. Thus, Carpenter and Just label their model the "constituent comparison model." The verification time for any particular sentence is determined by the value of a parameter k, which is an index of the number of comparisons of constituents that have been made. The final value of the response index (TRUE or FALSE) determines the subject's response.
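
The comparison procedure just described can be illustrated with a minimal sketch. It is our illustration under stated assumptions, not Carpenter and Just's own formulation. The function names `verify` and `predicted_time`, the encoding of each representation as a list of constituents ordered from innermost to outermost, and the parameters `base_ms` and `increment_ms` are hypothetical. The sketch also assumes that a constituent pair that has already produced a mismatch is marked so that it does not trigger a second change of the response index after the comparison process is reinitialized; the description above does not state this, but some such device is needed for the process to terminate.

```python
def verify(sentence, picture):
    """Compare two encodings constituent by constituent, innermost first.

    `sentence` and `picture` are equal-length lists of constituents ordered
    from the innermost predicate outward (a hypothetical encoding).
    Returns (response_index, number_of_comparisons).
    """
    response = True        # response index, initialized at TRUE
    comparisons = 0        # running count of constituent comparisons
    registered = set()     # assumption: mismatches already acted upon
    i = 0
    while i < len(sentence):
        comparisons += 1   # every comparison adds one increment
        if sentence[i] != picture[i] and i not in registered:
            registered.add(i)
            response = not response   # mismatch changes the index
            i = 0                     # and reinitializes the comparison
            continue
        i += 1
    return response, comparisons


def predicted_time(comparisons, base_ms, increment_ms):
    """Linear latency: a base time (assumed) plus a constant per comparison."""
    return base_ms + increment_ms * comparisons


if __name__ == "__main__":
    red_dots = ("RED", "DOTS")
    black_dots = ("BLACK", "DOTS")
    # Affirmative sentence, matching picture
    print(verify([red_dots, "AFF"], [red_dots, "AFF"]))
    # Affirmative sentence, mismatching picture
    print(verify([red_dots, "AFF"], [black_dots, "AFF"]))
    # Negative sentence, picture matching the inner predicate
    print(verify([red_dots, "NEG"], [red_dots, "AFF"]))
    # Negative sentence, picture mismatching the inner predicate
    print(verify([red_dots, "NEG"], [black_dots, "AFF"]))
```

Under these assumptions the four cases require 2, 3, 4, and 5 comparisons, respectively, so that latency grows linearly with the number of comparisons. The sketch takes both encodings as given; it says nothing about how they are derived from the sentence or the picture.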