Effect of Various Spatial Auditory Cues on the Perception of Threat in a First-Person Shooter Video Game

Konstantin Semionov, contact@konstantinsemionov.com, Edinburgh Napier University, Edinburgh, Scotland
Dr Iain McGregor, i.mcgregor@napier.ac.uk, Edinburgh Napier University, Edinburgh, Scotland

ABSTRACT
This study interviewed game audio professionals to establish the implementation requirements for an experiment to ascertain the effect of different spatial audio localisation systems on the perception of threat in a first-person shooter. In addition, a listening study was carried out involving 35 members of the public, using three scenes made in Unreal Engine 4 with custom-designed sound, authored in Wwise. This established that spatial audio does, indeed, have a noticeable effect on players' perception of threat. Each spatial audio system, however, had different effects on the perception of threat, stealth, realism and position estimation across all three visual scenes. Meanwhile, the absence of spatial audio can confuse people and contribute towards inaccurate enemy localisation. With this in mind, a tailored approach to the game design requirements for each project is recommended. Rather than a single, spatialised design for the entire game, each scene should have its own design solution.

CCS CONCEPTS
• Applied computing → Sound and music computing; • Human-centered computing → Human computer interaction (HCI).

KEYWORDS
Sound design, first-person shooter, threat perception, audio implementation, spatial audio

ACM Reference Format:
Konstantin Semionov and Dr Iain McGregor. 2020. Effect of Various Spatial Auditory Cues on the Perception of Threat in a First-Person Shooter Video Game. In Proceedings of Audio Mostly (AM'20), September 15–17, 2020, Graz, Austria. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/3411109.3411119

1 INTRODUCTION
Game developers are implementing more advanced audio options in their games [18].
However, there is not enough available research to suggest whether those options enhance gameplay, or whether they cause players to experience the game differently. Existing research cannot tell us to what extent spatial localisation techniques benefit players of FPS games, or whether they make games feel more realistic.

Linking a character's generated sound to an expectation of threat, without the threat being visible, is one sound design technique that can alert a player to a potential threat and prompt caution in subsequent gameplay [16]. Audio spatialisation techniques are said to provide a way to identify an opponent's location accurately [42]. Discovering whether these contribute towards the perception of threat within a competitive game would be helpful – as would identifying which are most effective. Localisation and reverberation have been identified as crucial aspects of a first-person shooter experience when enemies are present in the game, due to the "hunter and hunted" nature of the genre [24]. The role of occlusion in spatial localisation, however, is less clear. Spatial localisation could also be achieved with the use of loudspeaker-based surround systems.
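Horizontal-plane localisation via interaural time differences, discussed below, can be illustrated with Woodworth's classic spherical-head approximation. This is a minimal background sketch only; the default head radius, speed of sound and function name are illustrative assumptions and not part of this study:

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate interaural time difference (ITD) for a distant source.

    Uses Woodworth's spherical-head formula ITD = (r/c) * (theta + sin(theta)),
    valid for azimuths in the frontal quadrant (0-90 degrees). The head radius
    (~8.75 cm) and speed of sound (343 m/s) are common illustrative defaults.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

# A source directly ahead produces no time difference between the ears...
print(f"{woodworth_itd(0) * 1e6:.0f} us")    # 0 us
# ...while a source at 90 degrees yields roughly the maximum ITD (~656 us).
print(f"{woodworth_itd(90) * 1e6:.0f} us")
# A 2-degree offset already produces a difference of tens of microseconds,
# which is within the resolution of the human auditory system.
print(f"{(woodworth_itd(2) - woodworth_itd(0)) * 1e6:.1f} us")
```

Note that this approximation only models the timing cue; it says nothing about intensity differences or the spectral filtering that HRTF-based renderers simulate.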
Two separate experiments, however, found no significant improvement when moving from stereo to surround systems in a third-person computer game, on either speakers or headphones [42, 43]. Binaural reproduction, in contrast, has been found to convey better spatial attributes than stereo for headphone users [51].

A sound source's position in three-dimensional space is determined by a listener using azimuth in the horizontal plane, elevation in the vertical plane, and distance (depth). Horizontally, the azimuth is calculated from information arriving at both ears, using interaural time differences (ITD) and interaural intensity differences (IID) [44]. The interaural time difference (ITD) is a human mechanism that localises a sound source on the horizontal plane, based on the difference in arrival time between the ear nearer to and the ear farther from the source [35]. Studies have found this system can discriminate angles down to 2° in the 20 Hz to 2 kHz range [10]. Similar to the ITD, the interaural intensity difference (IID) is a mechanism that helps identify a sound source's position; however, it relies on the difference in sound source volume (intensity) between the ears [35]. Compared to the ITD, the IID provides better localisation for frequencies above 3 kHz [44].

Vertical localisation is a result of absorption, reflection and diffraction effects caused by the pinna of the ear, as well as the head, shoulders and chest [35, 44]. These spectral cues are called head-related transfer functions (HRTFs) and can vary greatly between individuals due to differences in the shape and size of the outer ear [44]. HRTF-based vertical sound localisation has been found to function monaurally in humans for high-frequency (above 7 kHz) content [14]. HRTF-based 3D audio renderers are used in video games to deliver a mixdown of the various sound sources within a game. However, it has been suggested that more