Motion, Stereo and Color Analysis for Dynamic Virtual Environments

Tamer F. Rabie
Department of Electrical and Computer Engineering
Ryerson Polytechnic University
350 Victoria Street
Toronto, Ontario, M5B 2K3, Canada
e-mail: tamer@ee.ryerson.ca

Demetri Terzopoulos
Department of Computer Science
University of Toronto
6 King’s College Road
Toronto, Ontario, M5S 3H4, Canada
e-mail: dt@cs.toronto.edu

Abstract

We develop a vision system for highly mobile autonomous agents that is capable of dynamic obstacle avoidance and active perception. We demonstrate the robust performance of the system in artificial animals with directable, foveated eyes, situated in physics-based virtual worlds. Through active perception, each agent controls its eyes and body by continuously analyzing photorealistic binocular retinal image streams. The vision system estimates optical flow, computes stereo disparity, and segments looming targets in the low-resolution visual periphery while controlling eye movements to track an object fixated in the high-resolution fovea. It matches segmented targets against mental models of colored objects of interest in order to decide whether the segmented objects are harmless or represent dangerous obstacles. The latter are localized, enabling the artificial animal to exercise the sensorimotor control necessary to support complex behaviors, such as predation and obstacle avoidance.

Keywords: Active Vision; Active Perception; Virtual Reality; Virtual Robotics; Multiagent Systems.

1 Introduction

Animals are active observers of their environment [14]. This fact has inspired a trend in computer vision popularly known as “active vision” [3, 4]. Our recently proposed animat vision paradigm offers a new approach to developing biologically inspired active vision systems and experimenting with them [29]. Rather than allow the limitations of available robot hardware to hamper research, animat vision prescribes the use of virtual robots that take the form of realistic artificial animals, or animats, situated in physics-based virtual worlds. Animats are autonomous virtual agents possessing highly mobile, muscle-actuated bodies and brains with motor, perception, behavior, and learning centers. In the perception center of the animat’s brain, computer vision algorithms continually analyze incoming perceptual information. Based on this analysis, the behavior center dispatches motor commands to the animat’s body, thus forming a complete sensorimotor control system.

Motion and color play an important role in animal perception. Birds and insects exploit optical flow for obstacle avoidance and to control their ego-motion [14]. Some species of fish are able to recognize the color signatures of other fish and use this information in certain piscine behaviors [1]. The human visual system is highly sensitive to motion and color. We tend to focus our attention on moving colorful objects. Motionless objects whose colors blend into the background are not as easily detectable, and several camouflage strategies in the animal kingdom rely on this fact [12].

Biological creatures move through the world with little apparent effort. Many do so using eyes with a high-acuity fovea covering only a small fraction of a visual field whose resolution decreases monotonically towards the periphery.
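To make the economy of such nonuniform sampling concrete, the following is a minimal sketch of a two-level fovea/periphery decomposition; the function name, the fovea fraction, and the subsampling factor are illustrative assumptions, not parameters of the retinal model used by our system, which grades resolution smoothly:

```python
import numpy as np

def foveate(image, fovea_frac=0.25, periphery_step=4):
    """Two-level approximation of a foveated retina: retain a central
    window at full resolution (the 'fovea') and subsample the whole
    field coarsely (the 'periphery'). Illustrative only; real retinas
    grade resolution smoothly toward the periphery."""
    h, w = image.shape[:2]
    fh, fw = int(h * fovea_frac), int(w * fovea_frac)
    y0, x0 = (h - fh) // 2, (w - fw) // 2
    fovea = image[y0:y0 + fh, x0:x0 + fw]                   # high-acuity center
    periphery = image[::periphery_step, ::periphery_step]   # coarse surround
    return fovea, periphery
```

For a 512 x 512 frame with these defaults, the fovea and periphery together comprise roughly 2 x 128 x 128 samples instead of 512 x 512, an eightfold reduction in “photoreceptors,” at the cost of the problems discussed next.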
Spatially nonuniform retinal imaging provides opportunities for increased computational efficiency through economization of photoreceptors and focus of attention, but it forces the visual system to solve problems that do not generally arise with a uniform field of view. A key problem is determining how to deal with objects that are detected in the low-resolution periphery while focusing attention on an object of interest fixated in the high-resolution fovea. In this paper we present a solution to this problem through the combined exploitation of color, motion, and depth information from stereo disparity.

Building upon the animat vision paradigm, the stereo-, motion-, and color-based motor and gaze control algorithms that we propose in this paper are implemented and evaluated within artificial fishes in a virtual marine world. The fish animats are the result of research in the domain of artificial life (see [30] for the details). In the present work, the fish animat serves as an autonomous mobile robot inhabiting a photorealistic, dynamic environment. Our new navigation algorithms significantly enhance the prototype animat vision system implemented in prior work [28, 22, 29, 23]. They support more robust vision-guided navigation in the artificial fishes, including obstacle recognition and avoidance. We briefly review the animat vision system in the next section before presenting, in the subsequent sections, our work on integrating motion, stereo disparity, and color analysis for animat navigation and perception.

2 A Prototype Animat Vision System

The basic functionality of the animat vision system, which is described in detail in [28, 29], starts with binocular perspective projection of the color 3D world onto the animat’s 2D retinas. Retinal imaging is accomplished by photorealistic
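As a point of reference for this retinal imaging step, here is a minimal sketch of binocular perspective projection under a pinhole model with parallel optical axes; the symbols (baseline, focal) and the simplified geometry are illustrative assumptions, not the actual camera model of the animat’s verging, foveated eyes:

```python
def project_binocular(P, baseline=0.1, focal=1.0):
    """Pinhole projection of a world point P = (X, Y, Z), Z > 0,
    onto left and right retinas whose optical centers sit at
    (-baseline/2, 0, 0) and (+baseline/2, 0, 0) with parallel
    optical axes. Returns both retinal points and the horizontal
    disparity, which equals focal * baseline / Z."""
    X, Y, Z = P
    xl = focal * (X + baseline / 2.0) / Z   # left-eye image coordinate
    xr = focal * (X - baseline / 2.0) / Z   # right-eye image coordinate
    y = focal * Y / Z                       # same vertical coordinate in both
    return (xl, y), (xr, y), xl - xr
```

With focal = 1 and baseline = 0.1, for instance, a point at depth Z = 2 projects with disparity 0.05, and nearer points yield proportionally larger disparities; it is this inverse relationship between disparity and depth that the stereo analysis in the subsequent sections exploits.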