Visual Attention: Detecting Abrupt Onsets within the Selective Tuning Model

John K. Tsotsos, Sean M. Culhane, Winky Yan Kei Wai
Dept. of Computer Science, University of Toronto, Toronto, Ontario, Canada M5S 1A4
tsotsos@vis.toronto.edu

Abstract

This paper focuses on one dimension of our model of visual attention, namely the detection and quantification of abrupt onsets and offsets. The overall model is based on the concept of selective tuning. The goal of the research is to develop a model of visual attention that has both biological plausibility and computational utility. Abrupt onsets are well-known attention capture cues: they not only play a large role in signaling salient events in everyday life but also figure prominently in most psychophysical experimental paradigms. Our solution is simple and easily parallelized, yields excellent performance, and provides useful robot head control cues for onset foveation. The model is described in some detail and several performance examples are shown. A description of the implementation is also included.

1.0 Introduction

This paper presents a model of visual attention based on the concept of selective tuning. The goal of the research is to develop a model of visual attention that has both biological plausibility and computational utility. In this paper we focus on one dimension of the model, namely the detection and quantification of abrupt onsets and offsets. The central thesis of our research is that attention acts to optimize the search procedure inherent in a solution to vision, whether that solution is implemented in the brain or in a computer. This model of attention addresses the reduction of the number of candidate image subsets and of feature subsets that are considered in matching; it does so by selectively tuning the visual processing network.
Computational arguments linking search optimization to attention for vision and the concept of attentive selective tuning first appeared in [1]. Attention operates continuously and automatically: without attention, so-called general purpose vision is not possible. The model is most closely related to [2,3,4,5].

1.1 The Need for Attention in Vision

As argued in [6], selective attention is one of the important mechanisms for dealing with the combinatorial aspects of search in vision. The visual attention mechanism seems to involve at least the following basic components: i) the selection of a region of interest in the visual field; ii) the selection of feature dimensions and values of interest; iii) the control of information flow through the network of processes that constitutes the visual system; and iv) the shifting from one selected region to the next in time. These are briefly discussed in turn below; solutions are proposed for some of them in [7]. Other aspects of attention, such as the transformation of task information into attentional instructions, integration of successive attentional fixations, interactions with memory, and indexing into model bases, are not addressed here.

1.1.1 The Need for Region of Interest Selection. In [8], it was proved that visual search, in the case where explicit targets are given in advance, has time complexity that is linear in the size of the image (and this linear response time vs. display size is verified psychophysically in a large body of work). If, on the other hand, no explicit target is provided, the task is NP-Complete. Thus, it may be concluded that the brain is not solving this general problem [6,9]. The intractability is due solely to the combinatorial nature of selecting which parts of the input image are to be processed; there are an exponential number of such image subsets.
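The gap between the two cases can be made concrete with a small counting sketch. The Python below is illustrative only and is not part of the model: the function names are hypothetical, and the targeted case is simplified to one match test per image location, which is an assumption standing in for the linear bound cited from [8].

```python
from itertools import combinations


def count_targeted_candidates(num_pixels: int) -> int:
    """Assumed simplification: with an explicit target, one test per location."""
    return num_pixels


def count_unbounded_candidates(num_pixels: int) -> int:
    """Number of non-empty pixel subsets available when no target is given."""
    return 2 ** num_pixels - 1


def enumerate_subsets(pixels):
    """Yield every non-empty subset of `pixels`; feasible only for tiny inputs."""
    for size in range(1, len(pixels) + 1):
        yield from combinations(pixels, size)


if __name__ == "__main__":
    # Compare linear and exponential growth for a few hypothetical image sizes.
    for n in (4, 16, 64, 256):
        print(n, count_targeted_candidates(n), count_unbounded_candidates(n))
```

Even for an image of a few hundred pixels, the subset count is far beyond exhaustive enumeration, which is the sense in which region of interest selection is needed.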
Attentional selection may determine which mapping to attempt to verify first; if the first such mapping selected is a good one, a great deal of search can be avoided, otherwise there is the potential for a very inefficient search process. For sufficiently small images and/or sufficiently massive

0-8186-7134-3/95 $04.00 © 1995 IEEE