IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 21, NO. 2, FEBRUARY 2013 439
Acoustic Source Localization With Distributed
Asynchronous Microphone Networks
A. Canclini, F. Antonacci, A. Sarti, Member, IEEE, and
S. Tubaro, Member, IEEE
Abstract—We propose a method for localizing an acoustic source with
distributed microphone networks. Time Differences of Arrival (TDOAs)
of signals pertaining to the same sensor are estimated through Generalized
Cross-Correlation. After a TDOA filtering stage that discards measure-
ments that are potentially unreliable, source localization is performed
by minimizing a fourth-order polynomial that combines hyperbolic
constraints from multiple sensors. The algorithm turns out to exhibit a
significantly lower computational cost compared with state-of-the-art
techniques, while retaining an excellent localization accuracy in fairly
reverberant conditions.
Index Terms—Source localization, distributed microphone arrays, hy-
perbolas intersection.
I. INTRODUCTION
The problem of acoustic source localization has significantly
evolved with technological needs. The literature, in fact, is rich with
localization solutions that adapt to various operating conditions.
Particularly interesting are those that add the range to the vector of un-
knowns, with the result of linearizing the source localization problem
and improving both accuracy and computational efficiency [1]–[3].
In the past few years, however, there has been a growing interest in
spatial distributions of independent (unsynchronized) acoustic sensors,
each made of two or more synchronized microphones. Existing solutions
typically compute the Global Coherence Field (GCF) [4] or Steered
Response Power (SRP) [5] maps associated with all the microphone pairs
over a spatial grid and then localize the source as the peak of the
cumulative global map, at an overall computational cost that is often
too demanding for the application at hand. Better
computational efficiency is achieved in [6], where the SRP algorithm
adopts a modified computation over a coarser grid. Alternative
approaches based on Least Squares (LS) were proposed in [7], which
proved efficient for compact arrays but somewhat sensitive to
environmental noise, and in [8], where a Stochastic Region Contraction
of the grid was proposed, adopting a multi-resolution approach.
Manuscript received April 13, 2012; revised June 22, 2012; accepted August
11, 2012. Date of publication August 28, 2012; date of current version December
21, 2012. This work was supported by the EU FET-Open SCENIC project under
GA 226007. The associate editor coordinating the review of this manuscript and
approving it for publication was Prof. Boaz Rafaely.
The authors are with the Dipartimento di Elettronica ed Informazione,
Politecnico di Milano, 20133, Milan, Italy (e-mail: canclini@elet.polimi.it;
antonacci@elet.polimi.it; sarti@elet.polimi.it; tubaro@elet.polimi.it).
Digital Object Identifier 10.1109/TASL.2012.2215601
In this article we propose a novel solution that is suitable for
spatial distributions of sensors and requires a modest computational load
without giving up on localization accuracy and robustness. We consider
spatially distributed sensors, each with synchronized microphones.
Different sensors are assumed to be independently clocked and
placed in space in an unconstrained but known fashion. Their location,
in fact, can be accurately estimated using any of the self-calibration
methods that are available in the literature [9]–[12]. The reduction
in the computational cost of localization, which is important for
balancing the distribution of computational load among sensors of a
network, is achieved by using Time Differences Of Arrival (TDOAs)
between microphones of the same sensor. A preliminary removal of
TDOA outliers is performed by computing a reliability index based on
the ratio between direct-path and reverberant components of the
Generalized Cross-Correlation (GCC). Surviving TDOAs are then turned
into geometric constraints on the source location (hyperbolas whose
foci are at the microphone locations) at the node in charge of
localization (the central node). A novel global cost function is then
defined, which best combines these constraints. Localization is then
performed by minimizing the corresponding fourth-order polynomial.
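As a hedged illustration of why a single TDOA yields a hyperbolic constraint: a TDOA between two microphones fixes the difference of the source-to-microphone distances, and in 2D the set of points with a fixed range difference is one branch of a hyperbola with foci at the two microphones. The sketch below only evaluates that range-difference mismatch at a candidate point. The function names, the sign convention, and the speed of sound (343 m/s) are assumptions made for illustration; this is not the cost function proposed in the paper.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for illustration


def tdoa_from_position(source, mic_a, mic_b, c=SPEED_OF_SOUND):
    """Ideal TDOA in seconds: arrival time at mic_a minus arrival time at mic_b."""
    source, mic_a, mic_b = (np.asarray(v, dtype=float) for v in (source, mic_a, mic_b))
    return (np.linalg.norm(source - mic_a) - np.linalg.norm(source - mic_b)) / c


def hyperbola_residual(point, mic_a, mic_b, tdoa, c=SPEED_OF_SOUND):
    """Range-difference mismatch in metres; zero when `point` lies on the
    hyperbola branch (foci mic_a, mic_b) defined by the measured `tdoa`."""
    return c * (tdoa_from_position(point, mic_a, mic_b, c) - tdoa)


if __name__ == "__main__":
    mic_a, mic_b = [0.0, 0.0], [0.2, 0.0]   # hypothetical 20 cm microphone pair
    true_source = [2.0, 3.0]                # hypothetical source position
    tau = tdoa_from_position(true_source, mic_a, mic_b)
    print(hyperbola_residual(true_source, mic_a, mic_b, tau))   # on the hyperbola
    print(hyperbola_residual([-1.0, 0.5], mic_a, mic_b, tau))   # off the hyperbola
```

One such residual is available per surviving TDOA; a localization method then has to combine them across sensors, which is the role of the global cost function mentioned above.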
In order to test the robustness of the system against measurement
errors, in this paper we use the error propagation analysis introduced
in [13] for theoretically characterizing the performance that can be
achieved with a given configuration of sensors. Robustness against this
uncertainty is tested using Monte Carlo simulations as well, after pre-
liminary self-calibration based on [12].
The paper is structured as follows: Section II introduces the problem
and the related notation. Section III describes how TDOAs are con-
verted into hyperbolic constraints. Section IV illustrates the localiza-
tion technique and assesses its computational cost. Section V describes
the simulation setup and the results obtained.
II. NOTATION AND COMPUTATION OF TDOAS
For the sake of simplicity, in this paper we describe a method that is
suitable for 2D geometries. A generalization to the 3D case, however,
would be rather straightforward and would not result in a significant
growth of computational cost. By contrast, for grid-based localization
techniques such as GCF and LS, a generalization to the 3D case would
result in a significantly higher computational cost.
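To give a rough sense of scale for the grid-based case, consider a hypothetical square search region of side $L$, extended to height $H$ in 3D and sampled with grid spacing $\Delta$. These symbols and the numbers in the example are illustrative assumptions, not values from the paper.

```latex
N_{2D} = \left(\frac{L}{\Delta}\right)^{2}, \qquad
N_{3D} = \left(\frac{L}{\Delta}\right)^{2}\frac{H}{\Delta}, \qquad
\frac{N_{3D}}{N_{2D}} = \frac{H}{\Delta}.
% Example: L = 5 m, H = 3 m, \Delta = 0.05 m
% N_{2D} = 100^{2} = 10^{4}, \qquad N_{3D} = 10^{4} \cdot 60 = 6 \times 10^{5}
```

Since a grid-based method evaluates its map at every grid point, the number of evaluations in this example grows by a factor of 60 when moving from 2D to 3D.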
Let us consider a spatial distribution of sensors, each accommo-
dating microphones. Let
, be the coordinates of each microphone, and be
the corresponding acquired signal. For the sake of notational simplicity,
and with no loss of generality, let us assume that each sensor has the
same number of microphones. We also assume that the signals ac-
quired by the microphones of the same sensor are synchronized, while
no such assumption is made between different sensors. The acoustic
source is located at .
The localization is performed over chunks of data extracted with a
length- rectangular window. Over the duration of this window, we
assume that the source will not move significantly. The GCC-PHATs
[4], [6], [8] between all possible pairs of microphones are computed
on the same chunk of data. To keep the computational cost down, data
are processed only when the source is found to be active, as detected
with the method in [14]. Given , the discrete Time
Difference Of Arrival is estimated as [4]
where the maximum TDOA is determined by the distance be-
tween the two microphones. Notice that, in principle, for each sensor
we do not need to compute more than GCCs. According
to the outlined application scenario, however, the computational power
available at each sensor might be limited. The localization algorithm
proposed in this paper is therefore flexible enough to accommodate a
variable number of TDOAs.
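The TDOA estimation step described above can be sketched as follows: compute the GCC-PHAT of one microphone pair over a chunk of data, then pick the lag of its largest peak, searching only lags that are physically possible given the microphone spacing. This is a minimal sketch under stated assumptions, not the authors' implementation. The function name `gcc_phat_tdoa`, its parameters, the speed of sound (343 m/s), and the small constant guarding the division are all hypothetical choices.

```python
import numpy as np


def gcc_phat_tdoa(x1, x2, fs, mic_distance, c=343.0):
    """Estimate the discrete TDOA (in samples) of x1 relative to x2.

    A positive result means x1 is a delayed copy of x2.
    fs: sampling rate in Hz; mic_distance: microphone spacing in metres;
    c: speed of sound in m/s (assumed value).
    """
    n = len(x1) + len(x2)                       # zero-pad to limit circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X1 * np.conj(X2)                    # cross-power spectrum
    cross /= np.maximum(np.abs(cross), 1e-12)   # PHAT weighting: keep phase only
    gcc = np.fft.irfft(cross, n=n)              # gcc[k] is lag k, negative lags wrap to the end

    # Largest physically possible lag, set by the microphone spacing (at least 1 sample).
    max_lag = max(1, int(np.ceil(mic_distance / c * fs)))

    # Reorder so that index 0 corresponds to lag -max_lag.
    window = np.concatenate((gcc[-max_lag:], gcc[:max_lag + 1]))
    return int(np.argmax(window)) - max_lag


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s = rng.standard_normal(1024)                         # synthetic broadband chunk
    delay = 3                                             # samples
    x2 = s
    x1 = np.concatenate((np.zeros(delay), s[:-delay]))    # x1 lags x2 by `delay`
    print(gcc_phat_tdoa(x1, x2, fs=16000, mic_distance=0.2))
```

Restricting the search to the admissible lag range discards peaks that cannot correspond to a direct path, but it does not protect against reflections whose lag falls inside that range.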
In the presence of reverberations, some of the peaks in re-
lated to reflective paths might have a magnitude that turns out to be
comparable to or larger than that of the direct path. In this situation we
expect the estimation of the direct-path discrete TDOA to be unreli-
able. For this reason, we propose a reliability index that computes the
1558-7916/$31.00 © 2012 IEEE