Neuronal convergence in early contrast vision: Binocular summation is followed by response nonlinearity and area summation

Tim S. Meese, School of Life and Health Sciences, Aston University, Birmingham, UK
Robert J. Summers, School of Life and Health Sciences, Aston University, Birmingham, UK

We assessed summation of contrast across eyes and area at detection threshold ($C_t$). Stimuli were sine-wave gratings (2.5 c/deg) spatially modulated by cosine- and anticosine-phase raised plaids (0.5 c/deg components oriented at ±45°). When presented dichoptically the signal regions were interdigitated across eyes but produced a smooth continuous grating following their linear binocular sum. The average summation ratio ($C_{t1}/C_{t1+2}$) for this stimulus pair was 1.64 (4.3 dB). This was only slightly less than the binocular summation found for the same patch type presented to both eyes, and the area summation found for the two different patch types presented to the same eye. We considered 192 model architectures containing each of the following four elements in all possible orders: (i) linear summation or a MAX operator across eyes, (ii) linear summation or a MAX operator across area, (iii) linear or accelerating contrast transduction, and (iv) additive Gaussian, stochastic noise. Formal equivalences reduced this to 62 different models. The most successful four-element model was: linear summation across eyes followed by nonlinear contrast transduction, linear summation across area, and late noise. Model performance was enhanced when additional nonlinearities were placed before binocular summation and after area summation. The implications for models of probability summation and uncertainty are discussed.

Keywords: vision, masking, contrast gain control, area summation, spatial summation, binocular summation, psychometric function

Citation: Meese, T. S., & Summers, R. J. (2009). Neuronal convergence in early contrast vision: Binocular summation is followed by response nonlinearity and area summation. Journal of Vision, 9(4):7, 1–16, http://journalofvision.org/9/4/7/, doi:10.1167/9.4.7.

Introduction

The initial stages of vision decompose the two retinal images into local estimates of feature dimensions such as contrast, size, and orientation. However, (i) normal observers experience a unitary (binocular) vision of the world and (ii) the world contains spatially extensive surfaces and textures whose projections exceed the footprints (receptive fields) of the local retinal analyses, at least up to layer 4 of V1. Neuronal convergence across space and eyes is a necessary condition for building binocular object representations from local monocular measures, but what is the form of the convergence, and how is it organized? One way to investigate this is to measure contrast detection thresholds as a function of the dimension of interest, and to assess the level of improvement against various models of the process (Foley, Varadharajan, Koh, & Farias, 2007; Kersten, 1984; Meese, Georgeson, & Baker, 2006; Robson & Graham, 1981; Watson, 1979). However, a difficulty is that the number of model parameters or potential architectures is not necessarily well constrained by this approach. For example, in experiments that increase the size of a patch of grating placed in the central visual field, potentially confounding variables include the level of noise, retinal inhomogeneity, and uncertainty. Untangling these factors poses a serious challenge to the interpretation of this kind of experiment.

In a recent study, Meese and Summers (2007) introduced a stimulus set that was designed to overcome these problems in the spatial (area) domain. The basic idea was to use a grating-type stimulus with a constant diameter to encourage contrast integration (by whatever means) over the same retinal mechanisms in all conditions.
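As a brief aside on how a "level of improvement" is quantified, the two headline numbers in the abstract can be checked with a short sketch. It assumes the decibel values follow the 20 log10 convention (which is consistent with 1.64 being reported as 4.3 dB) and that the 192 architectures arise from every ordering of the four elements combined with every listed option. The function and dictionary names are hypothetical, chosen for illustration only. The sketch does not attempt the reduction to 62 models, because the formal equivalences are not specified in this part of the text.

```python
import math
from itertools import permutations, product


def ratio_to_db(ratio):
    """Convert a threshold (summation) ratio to decibels, assuming 20*log10."""
    return 20.0 * math.log10(ratio)


# The four model elements and their options, as listed in the abstract.
# The dictionary keys and option labels are illustrative names only.
ELEMENTS = {
    "eyes": ("sum", "max"),                    # (i) summation across eyes
    "area": ("sum", "max"),                    # (ii) summation across area
    "transducer": ("linear", "accelerating"),  # (iii) contrast transduction
    "noise": ("gaussian",),                    # (iv) additive Gaussian noise
}


def enumerate_architectures():
    """Return every ordering of the four elements with every option choice."""
    architectures = []
    for order in permutations(ELEMENTS):
        option_sets = [ELEMENTS[name] for name in order]
        for choice in product(*option_sets):
            architectures.append(tuple(zip(order, choice)))
    return architectures


if __name__ == "__main__":
    print(f"1.64 as dB: {ratio_to_db(1.64):.1f}")
    print(f"architectures: {len(enumerate_architectures())}")
```

The count follows from 4! = 24 orderings multiplied by 2 × 2 × 2 × 1 = 8 option combinations. The most successful four-element model reported in the abstract (linear summation across eyes, accelerating transduction, linear summation across area, late noise) appears as one entry in the enumerated list.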
If this could be achieved, then it seemed likely that this would control all of the problems outlined above. But how can the diameter of the stimulus be fixed, while allowing its area to be varied? The answer was to cut holes in the stimulus, or more accurately, to attenuate interdigitated patches of the stimulus. Example stimuli are shown in Figure 1. Figure 1a is a sine-wave grating that has been modulated by a ‘raised plaid’ (see Methods section for details) in cosine phase with the center of the stimulus. Figure 1b is similar, but the modulation is in anticosine phase. These stimuli were given the nominal titles of ‘white’

Received January 14, 2008; published April 6, 2009. ISSN 1534-7362 © ARVO