Effect of Various Spatial Auditory Cues on the Perception of
Threat in a First-Person Shooter Video Game
Konstantin Semionov
contact@konstantinsemionov.com
Edinburgh Napier University
Edinburgh, Scotland
Dr Iain McGregor
i.mcgregor@napier.ac.uk
Edinburgh Napier University
Edinburgh, Scotland
ABSTRACT
This study interviewed game audio professionals to establish the
implementation requirements for an experiment to ascertain the
effect of different spatial audio localisation systems on the per-
ception of threat in a first-person shooter. In addition, a listening
study was carried out with 35 members of the public, using three scenes built in Unreal Engine 4 with custom-designed sound authored in Wwise. This established that spatial audio does,
indeed, have a noticeable effect on players’ perception of threat.
Each spatial audio system, however, had different effects on the
perception of threat, stealth, realism and position estimation, on
all three different visual scenes. Meanwhile, the absence of spatial
audio can confuse people and contribute towards inaccurate enemy
localisation. With this in mind, a tailored approach to the game
design requirements for each project is recommended. Rather than
a single, spatialised design for the entire game, each scene should
have its own design solution.
CCS CONCEPTS
• Applied computing → Sound and music computing; •
Human-centered computing → Human computer interac-
tion (HCI).
KEYWORDS
Sound design, first-person shooter, threat perception, audio im-
plementation, spatial audio
ACM Reference Format:
Konstantin Semionov and Dr Iain McGregor. 2020. Effect of Various Spatial
Auditory Cues on the Perception of Threat in a First-Person Shooter Video
Game. In Proceedings of Audio Mostly (AM’20), September 15–17, 2020, Graz,
Austria. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/3411109.
3411119
1 INTRODUCTION
Game developers are implementing more advanced audio options
in their games [18]. However, little research is available to indicate whether those options enhance gameplay, or whether they cause players to experience the game differently. Existing research does not allow us to deduce to what extent spatial localisation
techniques benefit players of first-person shooter (FPS) games, or whether they make games feel more realistic.
Linking a character’s generated sound to an expectation of threat,
without the threat being visible, is one sound design technique that
can alert a player to a potential threat and prompt caution in subsequent gameplay [16]. Audio spatialisation techniques are said to
provide a way to identify opponent location accurately [42]. Dis-
covering whether these contribute towards the perception of threat
within a competitive game would be helpful, as would identifying
which are most effective. Localisation and reverberation have been
identified as crucial aspects of a first-person shooter experience
when enemies are present in the game, due to the “hunter and
hunted” nature of the genre [24]. The role of occlusion in spatial
localisation, however, is less clear.
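For context, occlusion is commonly approximated in game audio engines by attenuating an occluded source and rolling off its high frequencies. The sketch below is a minimal, illustrative version of that idea rather than an implementation from this study; the gain and filter values are arbitrary assumptions.

```python
def one_pole_lowpass(samples, alpha):
    """Simple one-pole low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1]).
    A smaller alpha means heavier filtering, i.e. a more muffled sound."""
    out, y = [], 0.0
    for x in samples:
        y = y + alpha * (x - y)
        out.append(y)
    return out

def apply_occlusion(samples, occluded, gain=0.5, alpha=0.2):
    """Crude occlusion model: when geometry blocks the direct path,
    attenuate the source and roll off its high frequencies."""
    if not occluded:
        return list(samples)
    return one_pole_lowpass([s * gain for s in samples], alpha)
```

Middleware such as Wwise exposes occlusion as per-source volume and filter curves configured in the project; the sketch only mirrors the general principle.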
Spatial localisation could also be achieved with the use of
loudspeaker-based surround systems. Two separate experiments,
however, found no significant improvements going from stereo
to surround systems in a third-person computer game on either
speakers or headphones [42, 43]. Binaural reproduction has been
found to convey better spatial attributes compared to stereo for
headphone users [51].
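Binaural reproduction of the kind cited above typically works by convolving each mono source with a pair of head-related impulse responses (HRIRs), one per ear. The sketch below illustrates this; the impulse responses are hypothetical placeholders standing in for measured HRTF data, not values from the study.

```python
import numpy as np

def render_binaural(mono: np.ndarray,
                    hrir_left: np.ndarray,
                    hrir_right: np.ndarray) -> np.ndarray:
    """Convolve a mono signal with per-ear impulse responses to produce
    a 2 x N binaural signal (row 0 = left ear, row 1 = right ear)."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right])

# Toy example: a unit impulse as the "source", and placeholder HRIRs in
# which the far (right) ear receives the sound later and quieter -- a
# crude stand-in for the ITD and IID cues a real HRTF pair encodes.
source = np.zeros(8)
source[0] = 1.0
hrir_l = np.array([1.0, 0.0, 0.0])   # near ear: immediate, full level
hrir_r = np.array([0.0, 0.0, 0.5])   # far ear: two samples later, quieter
binaural = render_binaural(source, hrir_l, hrir_r)
```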
Sound source position in three-dimensional space is determined by the listener using azimuth in the horizontal plane, elevation in the vertical plane and distance (depth). Horizontally, the azimuth
is calculated with information arriving at both ears, using inter-
aural time differences (ITD) and interaural intensity differences
(IID) [44]. The interaural time difference is the mechanism by which humans localise a sound source on the horizontal plane, based on the difference in arrival time between the ear nearest to and the ear furthest from the sound source [35]. Studies have found this system is able to discriminate between angles down to 2° in the 20 Hz to 2 kHz range [10]. Similarly, the interaural intensity difference is
a mechanism that helps with the identification of a sound source
position; however, it relies on the difference in sound intensity (level) between the ears [35]. Compared to ITD, IID
provides better localisation for frequencies above 3 kHz [44].
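The angular resolution quoted above can be related to time differences using a standard textbook model. The sketch below uses the Woodworth spherical-head approximation, ITD = (r/c)(sin θ + θ); the head radius is an assumed typical value, and the formula is a classical model rather than one taken from this paper.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate speed of sound in air at 20 °C
HEAD_RADIUS = 0.0875    # m, assumed typical adult head radius

def woodworth_itd(azimuth_deg: float) -> float:
    """Approximate interaural time difference (seconds) for a distant
    source at the given horizontal azimuth (0 deg = straight ahead,
    90 deg = directly to one side), using the Woodworth spherical-head
    model: ITD = (r / c) * (sin(theta) + theta)."""
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (math.sin(theta) + theta)
```

A source directly to one side (90°) yields the maximum ITD, roughly 0.66 ms for this head size, while a source straight ahead yields zero.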
Vertical localisation is a result of absorption, reflection and
diffraction effects, caused by the pinna of the ear as well as the
head, shoulders and chest [35, 44]. These spectral cues are called
head-related transfer functions or HRTFs and can vary greatly
between individual people due to differences in the shape and size
of the outer ear [44]. HRTF-based vertical sound localisation has
been found to function monaurally in humans for high-frequency
(more than 7 kHz) content [14]. HRTF-based 3D audio renderers
are used in video games to deliver a mixdown of various sound
sources within the game. However, it has been suggested that more