This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 50, NO. 9, SEPTEMBER 2015

A 30 µW 30 fps 110 × 110 Pixels Vision Sensor Embedding Local Binary Patterns

Andrew Berkovich, Student Member, IEEE, Michela Lecca, Leonardo Gasparini, Member, IEEE, Pamela A. Abshire, Member, IEEE, and Massimo Gottardi, Member, IEEE

Abstract—We present a 110 × 110 pixel vision sensor that computes the Local Binary Patterns (LBPs) of an imaged scene with a power consumption of 30 µW at 30 fps. The LBP of a given pixel is a binary vector encoding the direction and sign of image contrast with respect to its neighbors. Each LBP provides a visual description of an image's local structure that is widely used for texture and object recognition. In the sensor proposed here, each pixel detects its corresponding LBP with respect to its four neighboring pixels and saves this information into a digital map using 6 bits to encode each pixel. The operation is executed during the exposure time and requires 83 pW per pixel per frame. The chip is implemented in a 0.35 µm CMOS process featuring 34 T square pixels with a 26 µm pitch. We illustrate some examples of image description based on the LBPs output by the sensor.

Index Terms—Active pixel sensors, image sensors, local binary pattern, low-power vision, visual processing.

I. INTRODUCTION

ULTRA-LOW power imaging has become a challenging topic for many applications, such as mobile devices, wireless sensor networks, and wearable electronics. While solutions for limiting the power consumption of individual electronic components have been adopted, bringing vision technology to ultra-low power consumption is still an open issue [1], [2]. In fact, vision systems often require large data acquisition and massively parallel processing, which hardly fit with low power consumption.
Minimizing the power consumption of individual components does not sufficiently reduce the power budget of such a system. A more effective solution adopts an integrated approach—one that merges the image acquisition and image processing tasks, which are usually separated in standard vision systems. Hardware and software integration is particularly convenient in the case of parallel image processing, i.e., when the system needs to repeat the same operations over each pixel. Embedding this visual processing in hardware, instead of software, leads to a more efficient, custom system with reduced power consumption.

Manuscript received February 04, 2015; revised May 06, 2015; accepted June 05, 2015. This paper was approved by Associate Editor Gyu-Hyeong Cho. The work was supported in part by the Project EnerViS, "Energy Autonomous Low-Power Vision Systems," within the Provincia Autonoma di Trento and University of Maryland R&D Cooperation Program 2012. M. Lecca, L. Gasparini, and M. Gottardi are with Fondazione Bruno Kessler, Povo (TN), Italy (e-mail: lecca@fbk.eu; gasparini@fbk.eu; gottardi@fbk.eu). A. Berkovich and P. A. Abshire are with the Department of Electrical and Computer Engineering, Institute for Systems Research, University of Maryland, College Park, MD 20740 USA (e-mail: asb77@umd.edu; pabshire@umd.edu). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/JSSC.2015.2444875

Fig. 1. An example of LBP with radius 1 and 4 sampling points. The picture reports the gray values of the central pixel (in red) and of its four neighbors (in black). The neighbors have been sampled in clockwise order: the number (from 0 to 3) close to each pixel indicates the sampling order. The corresponding LBP and its rotation-invariant version are shown in the figure. See text and Eq. (1) for more details.
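The sampling scheme of Fig. 1 can be sketched in software as follows. This is a hedged illustration, not the chip's implementation: the function names, the clockwise neighbor ordering (top, right, bottom, left), and the ">= center" comparison rule are assumptions for the example, and rotation invariance is obtained here by taking the minimum over circular bit rotations, a common normalization for LBPs.

```python
def lbp4(img, r, c):
    """4-bit LBP of pixel (r, c): bit i is set when neighbor i >= center.

    Neighbors are sampled in clockwise order as in Fig. 1 (assumed ordering):
    index 0 = top, 1 = right, 2 = bottom, 3 = left.
    """
    center = img[r][c]
    neighbors = [img[r - 1][c], img[r][c + 1], img[r + 1][c], img[r][c - 1]]
    code = 0
    for i, n in enumerate(neighbors):
        if n >= center:
            code |= 1 << i
    return code

def rotation_invariant(code, bits=4):
    """Normalize an LBP code against in-plane rotation by taking the
    minimum over all circular rotations of its bit pattern."""
    mask = (1 << bits) - 1
    return min(((code >> k) | (code << (bits - k))) & mask
               for k in range(bits))

# Small grayscale patch (arbitrary example values):
img = [
    [10, 80, 10],
    [10, 50, 90],
    [10, 20, 10],
]
code = lbp4(img, 1, 1)  # top 80 >= 50 sets bit 0, right 90 >= 50 sets bit 1
print(code, rotation_invariant(code))  # -> 3 3
```

A pixel whose four neighbors are all darker yields code 0 (a local maximum), while an edge passing through the pixel sets the bits on one side only; the rotation-invariant mapping then collapses the four rotated variants of each such pattern into a single class.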
Using this approach, signals are often filtered and/or binarized either at the pixel level [3], [4] or at the array level [5], with the additional benefit of not needing high-performance A/D converters. The main drawback is that a custom sensor heavily limits system flexibility, i.e., the information delivered by the sensor cannot be reused for arbitrary visual tasks. In general, a good trade-off between energy efficiency and usability must be defined taking into account the application scenarios.

In this work, we describe the architecture of a low-power vision sensor that embeds the computation of the local binary patterns (LBPs) of each pixel in a cluster of neighboring pixels (Fig. 1). The LBPs are visual features measuring a binary, directional, spatial contrast in a neighborhood of each pixel. They describe image micro-structures, such as edges, corners, lines, or flat regions, and are invariant to changes in light intensity. Usually, they are normalized to be insensitive to in-plane rotations. Originally introduced to describe image textures [6], LBPs are widely applied to many computer vision tasks, such as face analysis and detection [7], [8], fingerprint recognition [9], video background subtraction [10], and image retrieval [11]. For these applications, embedding the LBP computation in hardware reduces the computational load of the processor and the energy consumption of the entire system.

Our vision sensor captures images with a 110 × 110 pixel array and has a power consumption of 30 µW at 30 fps. Each pixel detects its corresponding LBP with respect to its four neighboring pixels and stores this information into a digital map using 6 bits to encode each pixel. The operation is executed during the exposure time and requires 83 pW per pixel per frame.

0018-9200 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
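The 83 pW per-pixel figure is consistent with the headline numbers, under the assumption that it is simply the total power divided evenly across all pixels and frames:

```python
# Consistency check of the quoted per-pixel, per-frame power figure
# (assumption: total power spread uniformly over pixels and frames).
power_w = 30e-6          # total power: 30 µW
pixels = 110 * 110       # array size
fps = 30                 # frame rate
per_pixel_frame = power_w / (pixels * fps)   # watts per pixel per frame
print(round(per_pixel_frame * 1e12), "pW")   # -> 83 pW
```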