Lossy Event Compression based on Image-derived Quad Trees and Poisson Disk Sampling

Srutarshi Banerjee, Zihao W. Wang, Henry H. Chopp, Oliver Cossairt, Aggelos K. Katsaggelos
Northwestern University
srutarshibanerjee2022@u.northwestern.edu

Abstract

With several advantages over conventional RGB cameras, event cameras have provided new opportunities for tackling visual tasks under challenging scenarios with fast motion, high dynamic range, and/or power constraints. Yet unlike image/video compression, the performance of event compression algorithms remains far from satisfactory and practical. The main challenge in compressing events is the unique event data form, i.e., a stream of asynchronously fired event tuples, each encoding the 2D spatial location, timestamp, and polarity (denoting an increase or decrease in brightness). Since events only encode temporal variations, they lack the spatial structure that is crucial for compression. To address this problem, we propose a novel event compression algorithm based on a quad tree (QT) segmentation map derived from the adjacent intensity images. The QT informs 2D spatial priority within the 3D space-time volume. In the event encoding step, events are first aggregated over time to form polarity-based event histograms. The histograms are then variably sampled via Poisson Disk Sampling, prioritized by the QT-based segmentation map. Next, differential encoding and run-length encoding are employed for encoding the spatial and polarity information of the sampled events, respectively, followed by Huffman encoding to produce the final encoded events. Our Poisson Disk Sampling based Lossy Event Compression (PDS-LEC) algorithm performs rate-distortion based optimal allocation. On average, our algorithm achieves greater than 6× higher compression compared to the state of the art.

1. Introduction

Inspired by biological visual systems, event cameras are novel sensors designed to capture visual information in a data form drastically different from traditional images and videos [38, 27]. Event pixels do not directly output intensity signals as traditional cameras do. Instead, each pixel compares the difference between the current log-intensity state and the previous state, and fires an event when the difference exceeds the positive or negative firing threshold. This sensing mechanism provides several benefits. First, event pixels operate independently, which enables very low latency (∼10 μs) and therefore high-speed imaging. Second, event cameras have a high dynamic range (HDR, ∼120 dB) compared to regular frame-based cameras (∼60 dB). Third, events reduce redundant captures of static signals. Last, event cameras consume lower power (10 mW) than traditional cameras (∼1 W). As such, event cameras have brought new solutions to many classical as well as novel problems in computer vision and robotics, including high frame-rate video reconstruction [46, 34, 37], with HDR [43, 33] and high resolution [44, 28, 45], 3D reconstruction of human motion [49] and scenes [31, 23], as well as odometry [11, 41] and tracking [52, 24].

Currently, events are mainly communicated in raw format using the Address Event Representation (AER) protocol [1]. The current AER protocol, AEDAT 4.0, released in July 2019 [1], uses a 96-bit representation for each event tuple (x, y, t, p), comprising the (x, y) position, timestamp, and polarity, while its earlier version, AEDAT 3.1, uses a 64-bit representation per event. The timestamp takes the most bits due to its resolution: 64 bits and 32 bits for AEDAT 4.0 and AEDAT 3.1, respectively.
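The firing rule above can be sketched in a few lines. This is a toy model of the sensing mechanism under stated assumptions; the function name `fire_events`, the threshold `theta`, and the flat 1D pixel layout are all illustrative, not taken from any sensor implementation:

```python
def fire_events(log_prev, log_curr, theta=0.2):
    """Toy model of event firing: a pixel emits polarity +1 (-1) when its
    log-intensity rises (falls) by at least the firing threshold theta."""
    events = []
    for i, (a, b) in enumerate(zip(log_prev, log_curr)):
        if b - a >= theta:
            events.append((i, +1))   # ON event: brightness rose past theta
        elif a - b >= theta:
            events.append((i, -1))   # OFF event: brightness fell past theta
        # otherwise the pixel stays silent, so static scenes produce no data
    return events

# Three pixels: one brightens, one darkens, one changes too little to fire.
print(fire_events([0.0, 0.0, 0.0], [0.3, -0.25, 0.1]))  # [(0, 1), (1, -1)]
```

The third pixel never appears in the output, which is exactly the redundancy-suppression property noted above: only temporal variations are encoded.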
Although AEDAT 4.0 has incorporated lossless encoding standards such as LZ4, LZ4 HIGH, ZSTD, and ZSTD HIGH [1], effective lossy event encoding has not been proposed in the literature or implemented in event cameras.

In traditional image/video compression standards, lossy compression is achieved by exploiting spatial and temporal correlations. However, events are discrete points scattered in the space-time volume (see, for example, Fig. 1 left). Several prior works have approached event compression [8, 16, 21]. TALVEN [21] aims at aggregating events based on event timestamps. While this improves the compression ratio (CR), the benefits of high compression are only evident when aggregating events over a long time du-

arXiv:2005.00974v2 [cs.CV] 1 Dec 2020
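As a back-of-the-envelope illustration of the CR discussion above: the raw AER stream size is fixed at 64 or 96 bits per event, so any encoder can be scored against it. The helper names below are ours, for illustration only, and do not come from TALVEN or the AEDAT tooling:

```python
def raw_aer_bytes(num_events, bits_per_event=64):
    """Raw stream size: AEDAT 3.1 spends 64 bits per event, AEDAT 4.0 spends 96."""
    return num_events * bits_per_event // 8

def compression_ratio(num_events, encoded_bytes, bits_per_event=64):
    """CR = raw size / encoded size (higher is better); an illustrative
    definition, not code from the paper."""
    return raw_aer_bytes(num_events, bits_per_event) / encoded_bytes

# e.g. 10,000 AEDAT 3.1 events (80,000 raw bytes) encoded into 10 kB:
cr = compression_ratio(10_000, 10_000)  # CR = 8.0
```

Because the raw cost grows linearly with the event count, aggregating events over longer windows lets an encoder amortize per-pixel structure over more events, which is why time aggregation helps CR.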