IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 21, NO. 2, FEBRUARY 2013, p. 439

Acoustic Source Localization With Distributed Asynchronous Microphone Networks

A. Canclini, F. Antonacci, A. Sarti, Member, IEEE, and S. Tubaro, Member, IEEE

Abstract: We propose a method for localizing an acoustic source with distributed microphone networks. Time Differences of Arrival (TDOAs) of signals pertaining to the same sensor are estimated through Generalized Cross-Correlation. After a TDOA filtering stage that discards potentially unreliable measurements, source localization is performed by minimizing a fourth-order polynomial that combines hyperbolic constraints from multiple sensors. The algorithm turns out to exhibit a significantly lower computational cost than state-of-the-art techniques, while retaining excellent localization accuracy in fairly reverberant conditions.

Index Terms: Source localization, distributed microphone arrays, hyperbolas intersection.

I. INTRODUCTION

The problem of acoustic source localization has evolved significantly with technological needs, and the literature is rich with localization solutions that adapt to various operating conditions. Particularly interesting are those that add the range to the vector of unknowns, which linearizes the source localization problem and improves both accuracy and computational efficiency [1]–[3]. In the past few years, however, there has been a growing interest in spatial distributions of independent (unsynchronized) acoustic sensors, each made of two or more synchronized microphones. Examples of solutions in the literature first compute the Global Coherence Field (GCF) [4] or Steered Response Power (SRP) [5] maps associated with all the microphone pairs over a spatial grid, and then localize the source as the peak of the cumulative global map. The overall computational cost of these solutions is often too demanding for the application at hand.
Better computational efficiency is achieved in [6], where the SRP computation is modified to operate over a coarser grid. Alternative approaches based on Least Squares (LS) were proposed in [7], which proved efficient for compact arrays but somewhat sensitive to environmental noise; and in [8], where a Stochastic Region Contraction of the grid was proposed, adopting a multi-resolution approach.

Manuscript received April 13, 2012; revised June 22, 2012; accepted August 11, 2012. Date of publication August 28, 2012; date of current version December 21, 2012. This work was supported by the EU FET-Open SCENIC project under GA 226007. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Boaz Rafaely.
The authors are with the Dipartimento di Elettronica ed Informazione, Politecnico di Milano, 20133, Milan, Italy (e-mail: canclini@elet.polimi.it; antonacci@elet.polimi.it; sarti@elet.polimi.it; tubaro@elet.polimi.it).
Digital Object Identifier 10.1109/TASL.2012.2215601

In this article we propose a novel solution that is suitable for spatial distributions of sensors and requires a modest computational load without giving up localization accuracy and robustness. We consider spatially distributed sensors, each with synchronized microphones. Different sensors are assumed to be independently clocked and placed in space in an unconstrained but known fashion. Their location, in fact, can be accurately estimated using any of the self-calibration methods available in the literature [9]–[12]. The reduction in the computational cost of localization, which is important for balancing the distribution of computational load among the sensors of a network, is achieved by using Time Differences Of Arrival (TDOAs) between microphones of the same sensor.
A preliminary removal of TDOA outliers is performed by computing a reliability index based on the ratio between the direct-path and reverberant components of the Generalized Cross-Correlation (GCC). Surviving TDOAs are then turned into geometric constraints on the source location (hyperbolas whose foci lie on the microphone locations) at the node in charge of the localization (central node). A novel global cost function that best combines these constraints is then defined. Localization is finally performed by minimizing the corresponding fourth-order polynomial.

In order to test the robustness of the system against measurement errors, in this paper we use the error propagation analysis introduced in [13] to characterize theoretically the performance that can be achieved with a given configuration of sensors. Robustness against this uncertainty is also tested through Monte Carlo simulations, after a preliminary self-calibration based on [12].

The paper is structured as follows: Section II introduces the problem and the related notation. Section III describes how TDOAs are converted into hyperbolic constraints. Section IV illustrates the localization technique and assesses its computational cost. Section V describes the simulation setup and the results obtained.

II. NOTATION AND COMPUTATION OF TDOAS

For the sake of simplicity, in this paper we describe a method that is suitable for 2D geometries. A generalization to the 3D case, however, would be rather straightforward and would not result in a significant growth of computational cost. By contrast, for grid-based localization techniques such as GCF and LS, a generalization to the 3D case would result in a significantly higher computational cost.

Let us consider a spatial distribution of sensors, each accommodating a set of microphones. Each microphone has known coordinates and acquires its own signal.
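The microphone coordinates are the foci of the hyperbolic constraints mentioned in the Introduction. As a rough illustration of that geometry, the sketch below evaluates the textbook range-difference relation: a source lies on the hyperbola associated with a TDOA when the difference of its distances from the two microphones equals the TDOA times the speed of sound. This is not the fourth-order cost function of the paper. The function name `hyperbola_residual`, the sign convention (arrival time at the first microphone minus arrival time at the second), the speed of sound and the example coordinates are all assumptions made for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def hyperbola_residual(source, mic_a, mic_b, tdoa):
    """Range-difference residual of a candidate 2D source position.

    The residual is zero when `source` lies on the hyperbola branch with
    foci `mic_a` and `mic_b` that corresponds to `tdoa` (in seconds).
    Assumed convention: tdoa = (arrival time at a) - (arrival time at b).
    """
    source, mic_a, mic_b = (
        np.asarray(p, dtype=float) for p in (source, mic_a, mic_b)
    )
    range_difference = (
        np.linalg.norm(source - mic_a) - np.linalg.norm(source - mic_b)
    )
    return range_difference - SPEED_OF_SOUND * tdoa


# Hypothetical microphone pair (20 cm apart) and source position, in metres.
mic_a, mic_b = (0.0, 0.0), (0.2, 0.0)
true_source = (1.5, 2.0)

# Noise-free TDOA that the true source would produce for this pair.
tdoa = (np.hypot(1.5, 2.0) - np.hypot(1.3, 2.0)) / SPEED_OF_SOUND

on_curve = hyperbola_residual(true_source, mic_a, mic_b, tdoa)
off_curve = hyperbola_residual((0.0, 3.0), mic_a, mic_b, tdoa)
```

With noise-free data the residual vanishes at the true source and is nonzero elsewhere. A single pair only restricts the source to a curve, which is why constraints from several pairs and sensors have to be combined.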
For the sake of notational simplicity, and with no loss of generality, let us assume that each sensor has the same number of microphones. We also assume that the signals acquired by the microphones of the same sensor are synchronized, while no such assumption is made between different sensors. The acoustic source is located at an unknown position.

The localization is performed over chunks of data extracted with a fixed-length rectangular window. Over the duration of this window, we assume that the source does not move significantly. The GCC-PHATs [4], [6], [8] between all possible pairs of microphones are computed on the same chunk of data. In order to keep the computational cost down, data are processed only when the source is found to be active, using the method in [14]. Given the GCC-PHAT of a microphone pair, the discrete Time Difference Of Arrival is estimated as the lag at which the GCC-PHAT attains its maximum [4], with the search restricted to lags whose magnitude does not exceed the maximum TDOA, which is determined by the distance between the two microphones. Notice that, in principle, the number of GCCs that each sensor needs to compute is bounded. According to the outlined application scenario, however, the computational power available at each sensor might be limited. The localization algorithm proposed in this paper is therefore flexible enough to accommodate a variable number of TDOAs.

1558-7916/$31.00 © 2012 IEEE

In the presence of reverberation, some of the GCC peaks related to reflective paths might have a magnitude comparable to or larger than that of the direct path. In this situation we expect the estimate of the direct-path discrete TDOA to be unreliable. For this reason, we propose a reliability index that computes the