2D Sound Source Localization on a Mobile Robot with a Concentric Microphone Array

Yoko Sasaki
Tokyo University of Science, Chiba, Japan / The National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
y-sasaki@aist.go.jp

Yuki Tamai
Tokyo University of Science, Chiba, Japan / The National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
y-tamai@aist.go.jp

Satoshi Kagami
The National Institute of Advanced Industrial Science and Technology, Tokyo, Japan / Tokyo University of Science, Chiba, Japan
s.kagami@aist.go.jp

Hiroshi Mizoguchi
Tokyo University of Science, Chiba, Japan / The National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
hm@rs.noda.tus.ac.jp

Abstract - The purpose of this paper is to present a microphone array arrangement suited to a mobile robot and to develop a robotic audition system for recognizing the environment. The paper first describes the Delay and Sum Beam Forming (DSBF) algorithm and its common problem: side lobes. The array we developed produces smaller side lobes when beam forming, providing high-quality localization and separation of multiple sound sources. We then built a sound source mapping system using a wheeled robot equipped with the microphone array. The robot localizes sound directions while moving and estimates sound positions by triangulation; accumulating measurements improves accuracy. The system can estimate the positions of three sounds of different pressure levels with a position error of 200 mm. Moreover, the high-quality sound source separation has proved useful in improving speech recognition.

Keywords: Microphone array, sound source localization, mobile robot.

1 Introduction

Audition and vision are the most effective senses for recognition. Vision provides detailed information, but its field of view is narrow. Auditory systems, on the other hand, give rough but omni-directional information.
Such information remains available when the signal source is hidden from sight, for example in a concealed place or at night. The work presented in this paper aims at developing a robotic audition system supporting functions such as environment recognition by sound and hands-free speech communication with mobile robots.

A robotic audition system must work in real time and be robust to variations in environmental conditions, ambient noise, and measurement uncertainty. Furthermore, to enable applications of interest such as human speech recognition, it must be able to localize and separate sound sources over a sufficiently wide frequency range while rejecting other signals. In this sense, narrowband methods appear to be of limited interest.

Many robotic audition systems [1], [2] involve only two microphones. These methods are inspired by the human auditory system and rely on the interaural phase difference (IPD) and interaural intensity difference (IID) between the two microphones to localize one or two sound sources in a prescribed frequency band. Applications of these methods to the control of mobile robots have revealed the difficulty of transferring theoretical principles to a practical environment. These methods also usually use the head-related transfer function (HRTF) to calculate the IID and IPD. However, since determining the HRTF requires precise measurement in an anechoic room, HRTF-based approaches for mobile robots turn out to be limited in natural environments.

Although developing ear-like auditory systems remains challenging, microphone arrays with many sensors are still effective for increasing the resolution of the localization procedure and its robustness to ambient noise. Moreover, sensor arrays have long been studied in acoustics. Recent work has shown that their application to mobile robots can be very efficient, which motivates our choice of an array-based approach.
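To make the two-microphone idea concrete: the quantity underlying the IPD is the time lag between the channels, which can be estimated by cross-correlating the two signals. The sketch below is a simplified, hypothetical illustration (the function name and parameters are our own), not the HRTF-based method of the cited systems.

```python
import numpy as np

def estimate_lag(left, right, fs, max_delay_s=1e-3):
    """Estimate the inter-channel time lag (the quantity underlying
    the IPD) via cross-correlation of the two microphone signals.

    A positive result means `left` lags `right`; the sign therefore
    indicates which side of the head the source is on.
    """
    corr = np.correlate(left, right, mode="full")
    lags = np.arange(-len(right) + 1, len(left))
    # Keep only delays that are physically plausible for the mic spacing.
    mask = np.abs(lags) <= int(max_delay_s * fs)
    best = lags[mask][np.argmax(corr[mask])]
    return best / fs  # in seconds

# Toy usage: a noise burst arriving 5 samples later at the right mic.
fs = 16000
rng = np.random.default_rng(0)
sig = rng.standard_normal(1024)
left = sig
right = np.concatenate([np.zeros(5), sig[:-5]])
print(estimate_lag(left, right, fs))  # left leads: -5 / 16000 s
```

The lag, together with the known microphone spacing, yields the source bearing; with only two microphones, however, front/back ambiguity and a single angle estimate remain, which is one limitation the array-based approach in this paper addresses.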
The auditory system for our wheeled robot is based on a circular microphone array and uses the classical beam forming algorithm. This approach draws on our past study of a speaker array system [5], which generates a sound spot using beam forming techniques.

2 Multiple sound source localization

2.1 Delay and Sum Beam Forming

Most sound source localization methods come down to two types of beam forming: focus and null. We use the