3D ROOM GEOMETRY ESTIMATION FROM MEASURED IMPULSE RESPONSES

Sakari Tervo and Timo Tossavainen

Aalto University School of Science, Department of Media Technology
P.O. Box 15400, FI00076 Aalto

ABSTRACT

Estimation of the room geometry from spatial room impulse responses is studied. An algorithm for estimating the geometry is presented. The algorithm does not require any a priori information on the room shape, number of walls, or order of the reflections, but iteratively deduces the set of planes that explain the measured source and image-source locations and covariances. The algorithm is demonstrated with real data experiments.

Index Terms: Room geometry estimation, room impulse response, reflection

1. INTRODUCTION

The geometry of the room is one of the most essential parts of room acoustic modeling. Besides the prediction of the acoustics of rooms, room acoustic models can be used, for example, to enhance source localization performance [1].

Estimation of the room geometry can be divided into three subtopics: localization of the reflections, i.e., the image-sources; estimation of the surface parameters, i.e., plane points and normals; and estimation of the room geometry itself.

In principle, any general localization method can be used to localize the reflections. As an example, in [2] reflections are localized using sound intensity vectors and time of arrival (TOA).

The locations of the reflections can be used together with the estimated or a priori known source location to deduce the surface parameters. This requires knowledge of the order of the reflections. Plane parameters are estimated in [3] by rotating a B-format microphone around a loudspeaker that is directed towards the microphone. The estimation is based on the TOA and the direction of arrival (DOA) of the first arriving reflection in each direction. The TOA and DOA measurements are grouped using hierarchical clustering to avoid estimating the same plane multiple times.
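The step of deducing surface parameters from a source location and a localized reflection can be illustrated with a minimal sketch. It assumes that the reflection is of first order and that the wall is an ideal planar mirror, so that the image-source is the mirror image of the source: the wall then passes through the midpoint between the two, and its normal points along the line joining them. The function names, the azimuth/elevation convention, and the speed of sound of 343 m/s are assumptions made for illustration; they are not taken from the cited methods.

```python
import math


def image_from_doa_toa(mic, azimuth, elevation, toa, c=343.0):
    """Place an image-source along a DOA at distance c * toa from the microphone.

    Assumptions (illustrative only): angles are in radians, azimuth is measured
    in the x-y plane from the x axis, elevation is measured from the x-y plane,
    and toa is the propagation time in seconds from emission to arrival.
    """
    distance = c * toa
    direction = (
        math.cos(elevation) * math.cos(azimuth),
        math.cos(elevation) * math.sin(azimuth),
        math.sin(elevation),
    )
    return [m + distance * d for m, d in zip(mic, direction)]


def plane_from_image_source(source, image):
    """Return (unit normal, point on plane) for the wall mirroring source to image.

    Valid only under the assumption that `image` is a FIRST-order image-source
    of `source`; for higher orders the result is not a physical wall.
    """
    diff = [i - s for s, i in zip(source, image)]
    dist = math.sqrt(sum(d * d for d in diff))
    if dist == 0.0:
        raise ValueError("source and image-source coincide")
    normal = [d / dist for d in diff]                        # points from source to image
    point = [(s + i) / 2.0 for s, i in zip(source, image)]   # midpoint lies on the wall
    return normal, point


# Hypothetical example: a source at (1, 2, 1.5) m and its image behind the wall x = 0.
normal, point = plane_from_image_source([1.0, 2.0, 1.5], [-1.0, 2.0, 1.5])
```

In this hypothetical example the recovered plane is the wall x = 0, with the normal pointing away from the source. If the reflection is in fact of second or higher order, the same computation yields a plane that does not correspond to any wall, which is why the order of the reflections matters for this step.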
Moreover, in [4] the plane parameters are estimated with a common tangent algorithm. The same approach is applied in [5] and several other publications for the estimation of plane parameters.

The actual room geometry estimation algorithms combine the locations of the reflections and of the source with the orders of the reflections. One such algorithm, which uses only one room impulse response, has been proposed in [6]. The algorithm requires knowledge of the order of the first and second order reflections and of their arrival times. Moreover, in [7] a constrained room model and an l1-regularized least-squares method are applied to fit a 3-D shoebox model to a set of measured impulse responses. The number of walls is assumed to be known a priori. In addition, in [3] the clustering of the TOA and DOA measurements constitutes the room geometry estimation algorithm. The geometry estimation presented in [3], as well as in [4] and [5], uses the assumption that all the detected reflections are of first order.

This work was supported by ERC grant agreement no. [203636], HECSE, and Nokia Foundation.

To the understanding of the present authors, all the previous approaches use a priori information on the number of walls, the shape of the enclosure, or the order of the reflections. In particular, the a priori assumption on the order of the reflections is not feasible, since in most practical situations the earliest second order reflection arrives before the latest first order reflection.

Here a room geometry estimation algorithm is proposed that is able to deduce the room geometry without any of the above listed a priori information. The algorithm iteratively deduces the set of planes that has produced a set of estimated reflection locations and covariances.

The rest of the article is organized as follows. Section 2 presents the estimation of the reflection locations and of their covariance matrices from the spatial room impulse responses.
In Section 3, the geometry that explains the estimated locations and covariances of the reflections is estimated with an iterative maximum likelihood algorithm. Experiments are conducted with real data in Section 4. Section 5 discusses the results and concludes the article.

2. ESTIMATION OF SOURCE AND REFLECTION LOCATIONS

2.1. Reflection signal model

In this paper, a room impulse response measured with a microphone at location $r_n$ and a loudspeaker at location $x$ is considered as a sum of the direct sound and individual reflections:

$$
h_n(t) \triangleq h(r_n, x; t)
= \left[ \sum_{k=0}^{K} h_{k,n}(t) \right] + w_n(t)
= \left[ \sum_{k=0}^{K} \int_{-\infty}^{\infty} H_{k,n}(\omega)\, e^{j\omega t}\, d\omega \right] + w_n(t),
\qquad (1)
$$

where $t$ is time, $\omega$ is angular frequency, $n$ is the microphone index, $k = 0$ denotes the direct sound, $k = 1, \ldots, K$ denote the reflections, and $w_n(t)$ is measurement noise, which is normally distributed for each microphone and independent of the signal and of the noise at the other microphones. Moreover, $h_{k,n}(t)$ and $H_{k,n}(\omega)$ are the time and frequency domain representations of the direct sound and of the reflections.

The applied microphone array is assumed to have a small aperture compared to the dimensions of the room. The impulse responses can then be divided into short time windows, each of which includes only one reflection. In realistic situations this is true for the

978-1-4673-0046-9/12/$26.00 ©2012 IEEE, ICASSP 2012