Why do people appear not to extrapolate trajectories during multiple object tracking? A computational investigation

Sheng-hua Zhong
Department of Computing, Hong Kong Polytechnic University, Hong Kong
Department of Psychological and Brain Sciences, The Johns Hopkins University, Baltimore, MD, USA

Zheng Ma
Department of Psychological and Brain Sciences, The Johns Hopkins University, Baltimore, MD, USA

Colin Wilson
Department of Cognitive Science, The Johns Hopkins University, Baltimore, MD, USA

Yan Liu
Department of Computing, Hong Kong Polytechnic University, Hong Kong

Jonathan I. Flombaum
Department of Psychological and Brain Sciences, The Johns Hopkins University, Baltimore, MD, USA

Intuitively, extrapolating object trajectories should make visual tracking more accurate. This has proven to be true in many contexts that involve tracking a single item. But surprisingly, when tracking multiple identical items in what is known as "multiple object tracking," observers often appear to ignore direction of motion, relying instead on basic spatial memory. We investigated potential reasons for this behavior through probabilistic models endowed with perceptual limitations in the range of typical human observers, including noisy spatial perception. When we compared a model that weights its extrapolations relative to other sources of information about object position with one that does not extrapolate at all, we found no reliable difference in performance, belying the intuition that extrapolation always benefits tracking. In follow-up experiments we found this to be true for a variety of models that weight observations and predictions in different ways; in some cases we even observed worse performance for models that extrapolate compared to one that does not extrapolate at all. Ultimately, the best performing models either did not extrapolate, or extrapolated very conservatively, relying heavily on observations.
These results illustrate the difficulty and attendant hazards of using noisy inputs to extrapolate the trajectories of multiple objects simultaneously in situations with targets and featurally confusable nontargets.

Introduction

Multiple object tracking (MOT; Pylyshyn & Storm, 1988) is among the most popular and productive paradigms for investigating the underlying nature of visual cognition. In a typical experiment, a set of featurally identical objects moves about a display independently, and the task is to track a subset of the objects that were initially identified as targets (Figure 1). This task demands sustained effort; it cannot be accomplished via eye movements that shadow the motion of all targets; and basic display factors such as speed, duration, and the numbers of targets and nontargets afford direct and intuitive manipulations of task difficulty. The MOT paradigm has proven remarkably useful for identifying general properties of visual processing, such as the utility of inhibition alongside selective attention (Pylyshyn, 2006), the reference frames over which visual cognition operates (Liu et al., 2005), and the underlying units of selective attention (Scholl, Pylyshyn, & Feldman, 2001).

Recent advances have added a further dimension to the study of MOT by characterizing the computational problems at the core of the task. Specifically, visual tracking of multiple objects can be formalized as a

Citation: Zhong, S.-h., Ma, Z., Wilson, C., Liu, Y., & Flombaum, J. I. (2014). Why do people appear not to extrapolate trajectories during multiple object tracking? A computational investigation. Journal of Vision, 14(12):12, 1–30, http://www.journalofvision.org/content/14/12/12, doi:10.1167/14.12.12.

Journal of Vision (2014) 14(12):12, 1–30. ISSN 1534-7362. © 2014 ARVO. Received July 17, 2013; published October 13, 2014.
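The contrast drawn in the abstract, between a model that weights extrapolated predictions against observations and one that relies on spatial memory alone, can be illustrated with a minimal sketch. This is not the paper's actual model: the one-dimensional setting, the Gaussian observation noise, and the single blending weight `w_extrap` are illustrative assumptions, and only a single object with perfectly known velocity is simulated.

```python
import random

def update_estimate(prev_est, velocity, observation, w_extrap):
    """Blend a motion-extrapolated prediction with a noisy observation.

    w_extrap = 0.0 -> pure spatial memory (no extrapolation);
    w_extrap = 1.0 -> trust the extrapolated prediction entirely.
    """
    prediction = prev_est + velocity  # linear extrapolation, one step ahead
    return w_extrap * prediction + (1.0 - w_extrap) * observation

def simulate(w_extrap, n_steps=200, obs_noise=1.0, seed=0):
    """Track one object moving at constant velocity through noisy
    observations; return the mean absolute estimation error."""
    rng = random.Random(seed)
    true_pos, velocity, est = 0.0, 1.0, 0.0
    total_err = 0.0
    for _ in range(n_steps):
        true_pos += velocity
        obs = true_pos + rng.gauss(0.0, obs_noise)
        est = update_estimate(est, velocity, obs, w_extrap)
        total_err += abs(est - true_pos)
    return total_err / n_steps
```

In this idealized single-object setting, heavy extrapolation wins, matching the intuition the abstract describes for single-item tracking. The paper's point is that this advantage can vanish, or reverse, once velocity must itself be estimated from noisy input and multiple featurally confusable objects compete for correspondence.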