Tracking Objects with Vision Chip using Dynamic Neural Fields

Julien Martel, Institute of Neuroinformatics, Winterthurerstr. 190, 8051 Zurich, Switzerland, jmartel@ini.ethz.ch
Yulia Sandamirskaya, Institute of Neuroinformatics, Winterthurerstr. 190, 8051 Zurich, Switzerland, ysandamirskaya@ini.uzh.ch
Pjotr Dudek ... ... Manchester, UK larst@affiliation.org

ABSTRACT
This paper presents a hardware implementation of a Dynamic Neural Field on the Vision chip to enable robust, fast, and efficient salience-based object tracking.

CCS Concepts
•Computer systems organization → Embedded systems; Redundancy; Robotics; •Networks → Network reliability;

Keywords
Dynamic Neural Fields; Vision chip; tracking

1. INTRODUCTION
Tracking objects based on visual input is a computationally challenging task that requires processing of every camera frame at a high speed and thus relies on a high dataflow rate between the sensor and the processing unit. The data transfer from the sensor to the processor thus becomes the bottleneck for the speed and quality of tracking. Moving the tracking algorithm onto the sensor itself solves this problem by offloading computing operations on the high-resolution image data to the periphery. Since tracking is a non-local operation, such offloading typically would compromise the form factor and power consumption of the sensor. Here, we propose an algorithm for tracking that is based on local operations only, performed in parallel on a vision chip ... using the biologically inspired framework of Dynamic Neural Fields... Dynamical systems perspective...

2. METHODS
2.1 Dynamic Neural Fields
A Dynamic Neural Field (DNF), Eq.
(1), is a mathematical description of activity patterns that have been observed in neuronal populations in visual cortex [6, 1]:

\tau \dot{u}(x, t) = -u(x, t) + h + \int f\big(u(x', t)\big)\, \omega(x, x')\, dx' + S(x, t). \qquad (1)

Here, u(x) is the activation function of a DNF defined over a parameter space, x, which describes the state of the system (e.g., if x is the image space, u(x) can express the salience observed at each position in the image). h is a negative resting level that keeps the values of u(x) below the output threshold in the absence of external input. S(x) is the external input (e.g., from a vision sensor). f(u) is a sigmoidal non-linearity, Eq. (2), that shapes the output of the DNF: the output is zero for negative values of u(x) and positive for positive u(x), saturating for larger values of u(x).

f\big(u(x, t)\big) = \big(1 + e^{-\beta u(x, t)}\big)^{-1}. \qquad (2)

\omega(x, x') is the interaction kernel that determines the connectivity between positions x and x' on the DNF. Typically, the interaction kernel has a "Mexican hat" shape with short-range excitation and long-range inhibition, implementing a soft winner-take-all connectivity pattern, Eq. (3):

\omega(x, x') = c_{exc}\, e^{-\frac{(x - x')^2}{2\sigma_{exc}^2}} - c_{inh}\, e^{-\frac{(x - x')^2}{2\sigma_{inh}^2}}. \qquad (3)

Fig. ?? shows a simulated 2-dimensional DNF that receives a bimodal input.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. ICDSC 2016 Paris, France © 2016 ACM. ISBN 978-1-4503-2138-9. DOI: 10.1145/1235
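As an illustration of the dynamics in Eqs. (1)-(3), a one-dimensional DNF can be simulated in software with simple Euler integration; this is a minimal sketch, not the on-chip implementation described in this paper, and all parameter values (tau, h, beta, the kernel constants) are hypothetical choices for demonstration:

```python
import numpy as np

def mexican_hat(x, c_exc=1.0, sigma_exc=3.0, c_inh=0.5, sigma_inh=9.0):
    # Difference-of-Gaussians interaction kernel, Eq. (3):
    # short-range excitation minus long-range inhibition.
    return (c_exc * np.exp(-x**2 / (2 * sigma_exc**2))
            - c_inh * np.exp(-x**2 / (2 * sigma_inh**2)))

def simulate_dnf(S, steps=200, tau=10.0, h=-2.0, beta=4.0, dt=1.0):
    """Euler integration of the 1-D DNF dynamics, Eq. (1)."""
    n = S.shape[0]
    u = np.full(n, h)                    # field starts at the resting level h < 0
    w = mexican_hat(np.arange(n) - n // 2)
    for _ in range(steps):
        f_u = 1.0 / (1.0 + np.exp(-beta * u))        # sigmoid output, Eq. (2)
        lateral = np.convolve(f_u, w, mode="same")   # discretized integral term
        u += (dt / tau) * (-u + h + lateral + S)
    return u

# A localized input drives the field above threshold at the input site,
# forming a self-stabilized activity peak; positions far from the input
# remain at the negative resting level.
S = np.zeros(100)
S[45:55] = 4.0
u = simulate_dnf(S)
```

With these parameters, u is positive inside the stimulated region and stays below zero elsewhere, which is the localized-peak attractor discussed below.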
As a model of neuronal activity, DNFs abstract away both the discreteness of individual neurons and the spiking nature of the propagation of activity between them, and instead represent a neuronal population by a continuous activation function in the space of parameters encoded by the neurons [5]. In this simplified form, the neuronal dynamics can be easily related to perceptual and motor parameters that are measured in behavioral experiments with humans or animals, and thus close the gap between neuronal models and the behavioral outcome. Consequently, DNFs have been used to model perceptual and cognitive processes revealed in behavioral experiments [4, ?].
From a technological perspective, the DNF dynamics has the following properties that might be exploited in vision applications: (1) the low-pass filter characteristic of the DNF dynamics suppresses noise in the sensory input; (2) local lateral excitation leads to stabilisation of a localised-peak attractor of the DNF, leading to a "detection instability" which marks detection of a coherent sensory object in the