Spike Events Processing for Vision Systems

R. Serrano-Gotarredona 1, T. Serrano-Gotarredona 1, A. Acosta-Jiménez 1, A. Linares-Barranco 2, G. Jiménez-Moreno 2, A. Civit-Balcells 2, and B. Linares-Barranco 1

1 Instituto de Microelectrónica de Sevilla (IMSE-CSIC), Ed. CICA, Av. Reina Mercedes s/n, 41012 Sevilla, Spain.
2 Dpto. Arquitectura de Computadores, University of Sevilla, Sevilla, Spain

Abstract

In this paper we briefly summarize the fundamental properties of spike events processing applied to artificial vision systems. This sensing and processing technology is capable of very high speed throughput, because it does not rely on sensing and processing sequences of frames, and because it allows for complex hierarchically structured cortical-like layers for sophisticated processing. The paper includes a few examples that have demonstrated the potential of this technology for high-speed vision processing, such as a multilayer event processing network of 5 sequential cortical-like layers, and a recognition system capable of discriminating propellers of different shape rotating at 5000 revolutions per second (300000 revolutions per minute).

I. Introduction

Artificial (man-made) machine vision systems operate quite differently from biological brains. Machine vision systems usually operate by capturing and processing sequences of frames. For example, a video camera captures images at about 25-30 frames per second, which are then processed frame by frame to extract, enhance and combine features, and to perform operations in feature spaces, until a desired recognition is achieved. Biological brains do not operate on a frame by frame basis. In the retina, each pixel sends spikes (also called events) to the cortex when its activity level reaches a threshold. This activity level may respond to different image properties like intensity, contrast, color, motion, etc.
These properties have been pre-computed within the retina before the spikes are generated and sent to the visual cortex. Very active pixels will send more spikes than less active pixels. When the retina responds to a stimulus, for example a moving profile, the pixels sensing the profile will elicit a simultaneous collection of spikes which are strongly space-time correlated. The visual cortex receiving these spikes is sensitive to the spatial location where the spikes originated and to the relative timing between them. This way, it can recognize and follow the moving profile. All these spikes are transmitted as they are produced, without waiting for an artificial “frame time” before being sent to the next processing layer. This way, in biological brains, strong features are propagated and processed from layer to layer as soon as they are produced, without waiting to finish collecting and processing the data of whole image frames.

As an illustration, consider the setup in Fig. 1. On the left, a circular solid object (a ball) is observed by a motion sensing retina in the center. The pixels in this retina are sensitive to motion (changes in intensity). Consequently, at a given instant in time only the pixels on a circumference will become active. This means that the pixels on the same circumference will fire spikes simultaneously. Let us assume each pixel fires just one single spike. We may state that, at a given instant (or short time interval), the spikes produced by the retina are highly space-time correlated: in time because they are simultaneous, and in space because they form a circumference of a certain radius. In Fig. 1, the output spikes of the retina are sent, through projection fields, onto the next processing layer. Suppose the projection fields are tuned to detect circumferences of a given radius range, R ± ε. Then, each spike produced by a pixel in the retina will be sent to a circumference (of radius R) of pixels in the projection-field layer in Fig. 1.
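As a rough illustration of this projection step, the following Python sketch handles retina spikes one at a time: each spike adds one unit of stimulus to every pixel of a ring-shaped projection field, and a pixel of the projection-field layer emits an output spike when its accumulated stimulus reaches a threshold. The grid size, the integer approximation of the ring, the threshold value and the names `ring_offsets` and `project_events` are all assumptions made for this illustration; they are not taken from the hardware discussed in this paper.

```python
import math


def ring_offsets(radius):
    """Integer offsets (dx, dy) approximating a circumference of the given radius.

    Hypothetical helper: a pixel belongs to the ring if its distance to the
    origin is within half a pixel of `radius`.
    """
    r = int(math.ceil(radius)) + 1
    return [(dx, dy)
            for dx in range(-r, r + 1)
            for dy in range(-r, r + 1)
            if abs(math.hypot(dx, dy) - radius) < 0.5]


def project_events(events, kernel, size, threshold):
    """Accumulate each incoming spike over its projection field, event by event.

    Returns the (x, y) coordinates of the output spikes in firing order.
    """
    state = [[0] * size for _ in range(size)]   # stimulus per projection-layer pixel
    fired = []
    for (x, y) in events:                       # no frame: each spike is handled on arrival
        for (dx, dy) in kernel:
            px, py = x + dx, y + dy
            if 0 <= px < size and 0 <= py < size:
                state[py][px] += 1
                if state[py][px] == threshold:  # fires once, when the threshold is reached
                    fired.append((px, py))
    return fired


R = 5                                           # assumed projection-field radius, in pixels
kernel = ring_offsets(R)
center = (16, 12)                               # assumed ball center on a 32x32 retina
# One spike per retina pixel on the circumference of the moving ball.
events = [(center[0] + dx, center[1] + dy) for dx, dy in kernel]

outputs = project_events(events, kernel, size=32, threshold=len(kernel))
print(outputs)  # prints [(16, 12)]
```

With the threshold set to the number of pixels in the ring, only the pixel at the ball center receives a contribution from every active projection field, so it is the only one that fires. A lower threshold would make the detector tolerant to missing spikes, at the cost of possible spurious outputs near the center.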
This way, pixel ‘1’ in the retina sends a spike to all pixels in circumference ‘1’ of the projection-field layer. The same holds for pixels ‘2’, ‘3’, ‘4’, and all others in the retina circumference. If the circumference sensed in the retina has the same radius R as the projection-fields, as is the case in Fig. 1, then the pixel in the projection-field layer that has the same coordinates as the central pixel of the retina circumference (pixel ‘A’) will receive spikes from all active projection-fields. Consequently, this pixel will receive the strongest stimulus. The pixels in the projection-field layer can be made to fire a spike if their stimulus reaches a certain threshold. If this threshold is sufficiently high, only the central pixel ‘A’ in the projection-field layer will generate an output, signaling that this is the center of the moving ball of radius R sensed by the retina.

Fig. 1: Example of high-speed projection-field spike-based image processing for detecting a moving ball of a specific radius. (Panels: moving ball; motion-sensing retina; projection-field (convolution) processor.)

In general, projection-fields in biological neuro-cortical layers perform feature