Real-Time Target Localization and Tracking by N-Ocular Stereo

Takushi Sogo
Department of Social Informatics, Kyoto University
Sakyo-ku, Kyoto 606–8501, Japan

Hiroshi Ishiguro
Department of Computer and Communication Sciences, Wakayama University
930 Sakaedani, Wakayama 640–8510, Japan

Mohan M. Trivedi
Department of Electrical and Computer Engineering, University of California, San Diego
La Jolla, CA 92093–0407, U.S.A.

Abstract

In recent years, various practical systems using multiple vision sensors have been proposed. In this paper, as an application of such vision systems, we propose a real-time human tracking system consisting of multiple omnidirectional vision sensors (ODVSs). The system measures people’s locations by N-ocular stereo, an extension of trinocular stereo, from omnidirectional images taken with the ODVSs. In addition, the system employs several methods that compensate for observation errors in order to achieve robust measurement. We have evaluated the proposed methods in experiments using four compact ODVSs that we developed ourselves.

1. Introduction

Recent progress in multimedia and computer graphics is driving practical applications based on simple computer vision techniques. In particular, one practical approach that has recently attracted attention is to use multiple vision sensors with simple visual processing. For example, several systems track people or automobiles in real environments with multiple vision sensors [1, 2, 10, 13, 14], and other systems analyze their behaviors [3]. Compared with systems using a single vision sensor [6, 9, 12, 21], these systems can observe a moving target in a large space for a long time. However, they need many vision sensors to cover the environment seamlessly, since a single standard vision sensor has a narrow field of view.

On the other hand, an omnidirectional vision sensor (ODVS) provides a wide field of view.
In addition, the use of multiple ODVSs provides rich and redundant visual information, which enables robust recognition of the targets. Thus, multiple ODVSs, with their wide field of view, open up a new application area of computer vision.

Figure 1. Compact omnidirectional vision sensor

As an application of such vision systems, we propose a real-time human tracking system using multiple ODVSs [16]. We developed low-cost, compact ODVSs, shown in Figure 1 [7], and used them for this research. The system detects people, measures their azimuth angles with the ODVSs, and determines their locations by triangulation as shown in Figure 2 (a). In this system, the following problems of stereo using ODVSs (called omnidirectional stereo) should be considered:

- Correspondence problem among multiple targets
- Measurement precision of target locations

The former problem also occurs in conventional stereo using two or more vision sensors [11, 14, 17]. However, in our system it is more difficult to verify the correspondence of targets with visual features, since the baseline of ODVSs

0-7695-0704-2/00 $10.00 © 2000 IEEE