A Novel Sound Localization Experiment for Mobile Audio Augmented Reality Applications

Nick Mariette
Audio Nomad Group, School of Computer Science and Engineering
University of New South Wales, Sydney, Australia
nickm@cse.unsw.edu.au

Abstract. This paper describes a subjective experiment in progress to study human sound localization using mobile audio augmented reality systems. The experiment also serves to validate a new methodology for studying sound localization in which the subject is outdoors and freely mobile, experiencing virtual sound objects that correspond to real visual objects. Subjects indicate the perceived location of a static virtual sound source presented on headphones by walking to a position where the auditory image coincides with a real visual object. This novel response method accounts for multimodal perception and interaction via self-motion, both of which are ignored by traditional sound localization experiments performed indoors with a seated subject and minimal visual stimuli. Results for six subjects give a mean localization error of approximately thirteen degrees, with significantly lower error for discrete binaural rendering than for ambisonic rendering, and no significant variation across filter lengths of 64, 128 and 200 samples.

1 Introduction

Recent advances in consumer portable computing and position sensing technologies enable the implementation of increasingly sophisticated, lightweight systems for presenting augmented reality (AR) and mixed reality (MR) environments to mobile users. The growing prevalence of this technology increases the potential for AR/MR to become a common form of ubiquitous computing for information and entertainment applications. Furthermore, audio-only AR/MR applications allow less encumbered use than visual AR/MR applications, since the output device is a set of headphones, which is less intrusive and more familiar to the general public than visual devices such as the head-mounted display (HMD).

The concept of audio augmented reality, proposed at least as early as 1993 [1], is to present an overlay of synthetic sound sources upon real-world objects that create aural and/or visual stimuli¹ [2]. Also in 1993, even before the Global Positioning System (GPS) was complete, it was proposed to use GPS position tracking in a personal guidance system for the visually impaired, by presenting the user with

¹ In this paper, the augmentation of real visual stimuli with virtual sound will be considered audio AR, although existing definitions of AR and MR are not clear with regard to cross-sensory stimuli for the real and virtual components of the user's environment [2].
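To clarify the rendering conditions compared in the abstract, the sketch below illustrates discrete binaural rendering in the generic sense: a mono source signal is convolved with a left/right head-related impulse response (HRIR) pair for the source's direction relative to the listener, and the "filter length" varied in the experiment (64, 128 or 200 samples) is the length of each HRIR. This is a minimal illustration only; the placeholder HRIRs, signal, and function names are assumptions for demonstration, not the system described in this paper.

import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Convolve a mono signal with an HRIR pair to produce a stereo signal."""
    left = np.convolve(mono, hrir_left)    # left-ear channel
    right = np.convolve(mono, hrir_right)  # right-ear channel
    return np.stack([left, right], axis=0)

# Hypothetical usage: a one-second 1 kHz tone rendered with placeholder
# 128-sample HRIRs (random stand-ins; a real system would use measured HRIRs).
fs = 44100
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)
filter_length = 128                       # one of the lengths under test
rng = np.random.default_rng(0)
hrir_l = rng.standard_normal(filter_length) * 0.01
hrir_r = rng.standard_normal(filter_length) * 0.01
stereo = render_binaural(tone, hrir_l, hrir_r)

Longer HRIRs capture more of the head and pinna response at greater computational cost, which is why the experiment's finding of no significant variation across 64, 128 and 200 samples is relevant to resource-constrained mobile rendering.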