Motion extrapolation of auditory–visual targets

Sophie Wuerger a,*, Georg Meyer a, Markus Hofbauer b, Christoph Zetzsche c, Kerstin Schill c

a School of Psychology, Eleanor Rathbone Building, Bedford Street South, University of Liverpool, Liverpool L69 7AZ, United Kingdom
b Neurologische Klinik, Ludwig-Maximilians-Universität, Marchioninistraße 23, 80377 München, Germany
c Kognitive Neuroinformatik, Universität Bremen FB3, Enrique-Schmidt-Straße 5, 28359 Bremen, Germany

* Corresponding author. Tel.: +44 151 794 2173; fax: +44 151 794 2945. E-mail address: s.m.wuerger@liverpool.ac.uk (S. Wuerger).

Article history: Received 30 January 2007; received in revised form 18 October 2007; accepted 1 April 2009; available online 22 April 2009.

Keywords: Sensory integration; Auditory and visual motion; Motion extrapolation; Temporal localisation; Optimal integration

Abstract

Many tasks require the precise estimation of the speed and position of moving objects, for instance when catching or avoiding objects in our environment. Many of these objects are characterised by signal representations in more than one modality, such as hearing and vision. The aim of this study was to investigate the extent to which the simultaneous presentation of auditory and visual signals enhances the estimation of motion speed and instantaneous position. Observers are asked to estimate the instant at which a moving object arrives at a target spatial position by pressing a response button. This task requires observers to estimate the speed of the moving object and to calibrate the timing of their manual response so that it coincides with the true arrival time of the moving object. When both visual and auditory motion signals are available, the variability in estimating the arrival time of the moving object is significantly reduced compared to the variability in the unimodal conditions. This reduction in variability is consistent with optimal integration of the auditory and visual speed signals. The average bias in the estimated arrival times depends on the motion speed: for medium speeds (17 deg/s) observers' subjective arrival times are earlier than the true arrival times; for high speeds (47 deg/s) observers exhibit a (much smaller) bias in the opposite direction. This speed-dependency suggests that the bias is due to an error in estimating the motion speed rather than an error in calibrating the timing of the motor response. Finally, in this temporal localisation task, the bias and variability show similar patterns for motion defined by vision, audition or both.

© 2009 Elsevier B.V. All rights reserved.

1. Introduction

Significant progress has been made in understanding how signals from the auditory and visual modalities are combined for perceptual tasks such as spatial localisation [2,4,7,14,30], the detection and extraction of motion [1,5,17,18,20,23,24], bimodal synchrony and grouping [13,15,19,21], and visual search [9]. Most studies on auditory–visual motion processing have focused on the detection of global motion embedded in noise [1,5,17] or on motion biases introduced by one modality on the other (e.g. [23,24]). However, little is known about how humans integrate visual and auditory information in biologically relevant tasks in which motion speed as well as instantaneous position must be estimated from the auditory and visual modalities to initiate motor commands. To study motion extrapolation based on bimodal speed information, we use a temporal localisation task in which the subject has to predict when the moving object arrives at a spatial target location.
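As a concrete illustration of this task, the short Python sketch below (not part of the original study) shows how an observer's planned arrival time follows from the perceived speed, and how a misestimated speed translates into the speed-dependent bias described in the abstract; the trajectory length and the misestimated speeds are hypothetical placeholders.

# Illustrative sketch (not from the paper): arrival-time prediction from
# perceived speed, and the response bias produced by misestimating that speed.
# The distance and the perceived speeds below are hypothetical placeholders.

def predicted_arrival_time(distance_deg, perceived_speed_deg_s):
    # Arrival time the observer plans for, given the speed they perceive.
    return distance_deg / perceived_speed_deg_s

def response_bias(distance_deg, true_speed, perceived_speed):
    # Bias = planned arrival time minus true arrival time (negative = responds early).
    return predicted_arrival_time(distance_deg, perceived_speed) - distance_deg / true_speed

# Overestimating the speed of a 17 deg/s target yields an early response;
# slightly underestimating a 47 deg/s target yields a small late bias.
print(response_bias(20.0, true_speed=17.0, perceived_speed=19.0))  # ~ -0.12 s (early)
print(response_bias(20.0, true_speed=47.0, perceived_speed=45.0))  # ~ +0.02 s (late)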
When the target is defined visually, human observers can accurately point to the extrapolated final position of a moving target when feedback is given [6,22]. How this sensory signal is computed is still a matter of debate. Since the extrapolation of the final position of a target requires a correct estimate of instantaneous position as well as the extraction of speed, both spatial and temporal mechanisms must be involved in this task [29]. There is some evidence that the integration of auditory and visual motion signals occurs before speed is calculated, i.e. within spatial and temporal mechanisms [16]. This is consistent with the idea that the modality that is most reliable for a particular task will dominate performance in that task [26,28]. More recently this hypothesis of 'modality appropriateness' has been formulated within a quantitative framework [4,8,14,30], and numerous experimental studies have demonstrated its wide applicability to a variety of tasks (e.g. [7]).

In this study we are concerned with temporal localisation performance based on the integrated speed signals from the auditory and visual modalities, and we address the following questions: (i) how the simultaneous availability of motion speed estimates from two modalities affects performance in the temporal localisation task, and (ii) whether localisation errors are similar in the auditory, visual and bimodal conditions.
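For reference, the following minimal Python sketch (not from the paper) spells out the maximum-likelihood, inverse-variance-weighted cue-combination rule used in quantitative frameworks such as [4,8,14,30] to predict bimodal performance: each unimodal estimate is weighted by its reliability, and the predicted bimodal variance is never larger than the smaller unimodal variance. The numerical estimates and variances are illustrative assumptions, not measured values from this study.

# Minimal sketch of the maximum-likelihood (inverse-variance weighted)
# cue-combination rule; the estimates and variances are illustrative
# assumptions, not data from this study.
import math

def mle_combined(est_a, var_a, est_v, var_v):
    # Weights are proportional to the inverse variances of the unimodal estimates.
    w_a = var_v / (var_a + var_v)
    w_v = var_a / (var_a + var_v)
    combined = w_a * est_a + w_v * est_v
    combined_var = (var_a * var_v) / (var_a + var_v)  # never exceeds min(var_a, var_v)
    return combined, combined_var

# Hypothetical unimodal arrival-time estimates (s) and their variances (s^2):
est, var = mle_combined(est_a=1.10, var_a=0.04, est_v=1.00, var_v=0.02)
print(est, math.sqrt(var))  # predicted bimodal SD is below either unimodal SD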