Biol. Cybernetics 32, 211-216 (1979)
Biological Cybernetics
© by Springer-Verlag 1979

The Internal Representation of Solid Shape with Respect to Vision

J. J. Koenderink and A. J. van Doorn
Department of Medical and Physiological Physics, Physics Laboratory, State University Utrecht, Utrecht, The Netherlands

Abstract. It is argued that the internal model of any object must take the form of a function, such that for any intended action the resulting reafference is predictable. This function can be derived explicitly for the case of visual perception of rigid bodies by ambulant observers. The function depends on physical causation, not physiology; consequently, one can make a priori statements about possible internal models. A posteriori it seems likely that the orientation sensitive units described by Hubel and Wiesel constitute a physiological substrate subserving the extraction of the invariants of this function. The function is used to define a measure for the visual complexity of solid shape. Relations with Gestalt theories of perception are discussed.

1. Introduction

In order to survive in its environment an organism requires an internal representation of the causal texture (Tolman and Brunswik, 1935) of the world. Such a model enables the organism to anticipate the probable course of events and to organize its behaviour efficiently. A model must provide an efficient taxonomy of "objects" in order to describe the environment economically, tailored to the organism's likely sensory experience (Humphrey, 1973).

If an organism takes some action, then its sensory inflow as the result of the action will be determined by physical laws. There will be a functional dependence of the inflow $I$ on the voluntary act $A$ (the principle of reafference). This dependence will be different for different objects; we have

$$I = F(A;\ \alpha, \beta, \gamma, \ldots), \tag{1}$$

where $\alpha, \beta, \gamma, \ldots$ denote invariances of the object. The most complete model the organism can have of the object is

$$F(\ldots;\ \alpha, \beta, \gamma, \ldots), \tag{2}$$

with the first slot left open. The organism is then in a position to predict the result of all its potential actions when dealing with the object.

Thus the status of internal models is similar to that of the laws of physics. Take for instance Hooke's law: the elongation of a spring is proportional to the tension. Our internal representation of the concept of elasticity of a spring can hardly be anything other than: the result of my action to increase my muscular force x times will be an elongation of the spring by x times. Note that we have here made an a priori statement about our possible internal model of "elasticity". We cannot make any a priori statements about the way such a function is biologically implemented, e.g. as a table or as an algorithm.

1.1. The Principle Applied to Vision

Consider the case of a cyclopean observer with a pointlike eye (e.g. the first nodal point) in an environment of rigid objects (generalizations are easily made). We are concerned here with the veridical perception of three-dimensional shape, that is, with the extensive properties of objects, not with surface colour etc. In this view a trompe-l'oeil painting, for example, is to be considered just a flat surface covered with pigments. If the observer perceives something different we will label his percept illusory. It is clear that the usual pattern-recognition approach to perception would lead to a plethora of illusions in such a case. In fact humans easily fall prey to such illusions, except when they are ambulant (which is the reason why the most successful trompe-l'oeil paintings are arranged like peep-hole shows; Pirenne, 1970). The active observer is not deceived. This is because the perspective transformations on his retina can only be interpreted as