Dynamic voice directivity in room acoustic auralizations
Barteld NJ Postma, Brian FG Katz
Audio & Acoustic Group, LIMSI, CNRS, Université Paris-Saclay, Orsay, France. Email: {first.lastname}@limsi.fr

Introduction

The use of room acoustic auralizations has increased with the growing computing power available and the quality of numerical modelling software. In such auralizations, it is often possible to prescribe the directivity of an acoustic source in order to better represent the way in which a given source excites the room. However, such directivities are static, being defined according to source excitation as a function of frequency for the numerical simulation. While sources such as pianos vary little over the course of playing, voice directivity is known to vary, sometimes considerably, due both to phoneme-dependent radiation patterns [1] linked to changes in mouth geometry and to dynamic orientation.

Studies by Rindel and Otondo [2, 3] proposed to include dynamic vocal/instrumental directivity through multi-channel source directivity auralization.¹ This method employs anechoic multi-channel recordings. The radiation sphere of the source is divided into segments, each representing one microphone position. The room impulse response (RIR) is then calculated for each segment and convolved with the corresponding microphone channel of the anechoic recording. The convolutions of all channels are then down-mixed to create the multi-channel source directivity auralization. Unlike a simple single-channel source representation, this representation follows changes in direction, movement, asymmetry, and orientation of the recorded source. Multi-channel source directivity auralizations were subjectively compared to a static directivity source type.
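The per-segment convolution and down-mix described above can be sketched as follows. This is a minimal illustration, not the implementation used in the cited studies; the array names and shapes are assumptions, and the per-segment RIRs would in practice come from the GA simulation.

```python
import numpy as np
from scipy.signal import fftconvolve

def multichannel_auralization(anechoic_channels, rirs):
    """Down-mix a multi-channel source directivity auralization.

    anechoic_channels: one 1-D array per microphone/sphere segment
    rirs: the RIR simulated for each corresponding sphere segment
    (Illustrative sketch; names and shapes are assumptions.)
    """
    # Convolve each anechoic channel with the RIR of its sphere segment
    convolved = [fftconvolve(sig, rir)
                 for sig, rir in zip(anechoic_channels, rirs)]
    # Down-mix: sum the per-segment contributions into one signal
    n = max(len(c) for c in convolved)
    out = np.zeros(n)
    for c in convolved:
        out[:len(c)] += c
    return out
```

A single-channel static-directivity auralization corresponds to the special case of one segment covering the whole radiation sphere.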
The geometrical acoustics (GA) software ODEON was employed to create auralizations of anechoic clarinet recordings convolved with 2, 5, and 10 channels, and with a single channel using a static clarinet directivity. Fig. 1 depicts the partial sources, which were combined without overlap to represent the spherical recording area around the musician. A listening test compared these auralizations in terms of the perceived spaciousness of the sound in the room and the perceived naturalness of the timbre of the clarinet. Results of that study indicated that the 10-channel representation was judged significantly less spacious than the three other source representations. Additionally, the test subjects significantly preferred the 10-channel auralization over the others in terms of perceived naturalness.

Vigeant et al. [4] compared 1-, 4-, and 13-channel source directivity auralizations by means of a subjective listening test. The multi-channel source directivity representations and the employed GA software were the same as in the previously mentioned studies. The first phase of the test compared the different source representations for a violin, trombone, and flute in terms of realism and source size. Subjects rated the 13-channel auralization significantly more realistic than the other two. No significant trend was found regarding source size. In the second phase, the effect of source orientation (facing the audience and facing 180° away from the audience) on the Clarity of the 4-channel and 13-channel auralizations was studied. The results indicated that the 13-channel auralization was perceived as clearer when the source faced the audience. No significant difference regarding Clarity was observed when the sources faced 180° away from the audience.

¹ The original paper coined this application multi-channel auralization. In order to prevent confusion with multi-channel auralizations of distributed sources, this article employs the term multi-channel source directivity auralization.

Figure 1: Partial sources used for multi-channel source directivity auralizations (from [2]).

In contrast to previous studies, the final goal of this project is to employ multi-channel source directivity for the inclusion of dynamic source directivity and orientation using a single-channel anechoic recording. The advantages are a better representation of source directivity, that simulations need to be run only once even when the selected instrument is adjusted, and that the source directivity can be adjusted post-simulation in real time. A first step towards this goal is taken in this study by perceptually examining a newly established source decomposition. Where previous studies employed segmented directivity approaches, the current study investigates multi-channel source decomposition using an overlapping beamforming approach, described in Sec. 2. In order to validate this multi-channel source directivity, the source was placed in a GA model of the Théâtre de l'Athénée, created and calibrated according to [5]. The resulting auralizations were compared by means of a subjective listening test to auralizations employing static directivities. The setup and results of this test are described in Sec. 3. The incorporation of a single-channel anechoic recording into the multi-channel source directivity application is beyond the scope of the current study.

DAGA 2016 Aachen