1057-7149 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TIP.2018.2867171, IEEE Transactions on Image Processing

Temporal Colored Coded Aperture Design in Compressive Spectral Video Sensing

Kareth León-López, Student Member, IEEE, Laura Galvis, and Henry Arguello, Senior Member, IEEE

Abstract—Compressive spectral video sensing (CSVS) systems obtain spatial, spectral, and temporal information of a dynamic scene by encoding the incoming light rays with a temporal-static coded aperture (CA). CSVS systems use CAs with binary entries spatially distributed at random. The random spatial encoding of the binary CAs results in poor quality of the reconstructed images, even though the CSVS sensing matrix is incoherent with the sparse representation basis. Additionally, since some pixels are totally blocked, information such as object motion is missed over time. This work replaces the temporal-static binary coded apertures with a richer spatio-spectro-temporal encoding based on selectable color filters, named temporal colored coded apertures (T-CCA). The spatial, spectral, and temporal distributions of the T-CCAs are optimized by better satisfying the restricted isometry property (RIP) of the CSVS system. The RIP-optimized T-CCAs lead to spatio-spectral-temporal structures that tend to sense the spatial, spectral, and temporal dimensions more uniformly. An algorithm for optimally designing the T-CCAs is developed. Additionally, a regularization term based on the scene motion is included in the inverse problem, leading to better quality of the reconstructed images.
Computational experiments using four different spectral videos show that the proposed inverse problem and the T-CCA patterns improve the PSNR of the reconstructed images by up to 6 dB compared with binary CAs and with random and image-optimized CCA patterns.

Index Terms—Compressive spectral video, optical filters, colored coded apertures, time-varying coded aperture design, restricted isometry property.

Manuscript received December, 2017; revised January 26, 2018. K. León-López and H. Arguello are with the Department of System Engineering, Universidad Industrial de Santander, Bucaramanga, Santander, 630002 Colombia (email: henarfu@uis.edu.co). L. Galvis is with the Department of Electrical and Computer Engineering, University of Delaware, Newark, DE, 19716 USA.

I. INTRODUCTION

Spectral imaging (SI) combines 2D imaging and spectroscopy to sense spatial information across a large number of wavelengths. The obtained spectral image can be viewed as a 3D datacube with two spatial dimensions and one spectral dimension. Spectral images have been widely used in fields such as medical diagnosis [1], remote sensing [2], and military operations [3], among others. Traditional SI techniques, such as push-broom spectral imaging [4] or spectrometers based on optical band-pass filters [5], require scanning the scene one spatial line at a time or tuning a set of band-pass filters for each required spectral band, so that the datacube resolution grows linearly in proportion to the desired spatial or spectral resolution, thereby exponentially increasing the amount of acquired data and the acquisition costs. In contrast, other spectral imaging sensors, known as compressive spectral imaging (CSI) systems, use optical elements that sense spectral images in a single snapshot by exploiting compressive sensing (CS) theory.
In general, CS combines sampling and compression into a single non-adaptive linear measurement process that significantly reduces the number of measurements required to reconstruct the multidimensional image [6]. One of the most remarkable CSI systems is the coded aperture snapshot spectral imager (CASSI), which senses the spatial and spectral (3D) information of a scene in a single set of 2D compressive measurements [7].

Even though spectral images are useful in several applications, knowledge of how spectral frames change over short periods of time is valuable in many other applications such as surveillance [8], human tracking [9], and microscopic biological studies [10]. Thus, not only the spatial and spectral information but also the temporal information is nowadays of high interest to the scientific community [11]. Spatio-spectral information that changes over time is known as a dynamic spectral scene or spectral video.

Several works have focused on spectral multiplexing acquisition systems that capture spectral video in a compressed format, specifically on approaches based on compressive sensing, the so-called compressive spectral video sensing (CSVS) architectures. Architectures such as the coded aperture snapshot spectral imager extended to video [11, 12], the hybrid spectral video imaging system (HVIS) [13, 14], coded aperture compressive spectral-temporal imaging [15], and high-speed hyperspectral (HSHS) video acquisition [16] have been proposed to sense spectral dynamic scenes. Broadly, CSVS systems rely on coding and dispersion of the incoming light towards the camera sensor. In particular, in the CASSI system the encoding step is performed by a spatial light modulator (SLM), and fewer measurements are taken than in full sampling schemes [11].
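The coding-and-dispersion principle described above can be sketched as follows. This is a minimal illustration rather than the authors' sensing model: it assumes an idealized binary coded aperture, a dispersion of exactly one detector column per spectral band, and no noise. The function name `cassi_measure` and the datacube sizes are hypothetical and chosen only for the example.

```python
import random


def cassi_measure(cube, coded_aperture):
    """Simulate one CASSI-like snapshot of a spectral datacube.

    cube[k][i][j] is the intensity of band k at pixel (i, j);
    coded_aperture[i][j] is 1 (pass) or 0 (block).
    Band k is shifted k columns to the right (idealized dispersion,
    an assumption of this sketch), then all bands are summed on the
    2D detector.
    """
    bands, rows, cols = len(cube), len(cube[0]), len(cube[0][0])
    detector = [[0.0] * (cols + bands - 1) for _ in range(rows)]
    for k in range(bands):
        for i in range(rows):
            for j in range(cols):
                coded = coded_aperture[i][j] * cube[k][i][j]  # spatial coding
                detector[i][j + k] += coded  # dispersion shift and integration
    return detector


rng = random.Random(0)  # fixed seed so the example is repeatable
bands, rows, cols = 4, 6, 6  # hypothetical toy sizes
cube = [[[rng.random() for _ in range(cols)] for _ in range(rows)]
        for _ in range(bands)]
# Binary pattern with entries drawn at random (Bernoulli with p = 0.5)
aperture = [[1 if rng.random() < 0.5 else 0 for _ in range(cols)]
            for _ in range(rows)]

y = cassi_measure(cube, aperture)
n_unknowns = bands * rows * cols
n_measurements = len(y) * len(y[0])
print(n_unknowns, n_measurements)  # prints "144 54"
```

In this toy setting a single snapshot yields 54 detector values for 144 unknown datacube entries, which illustrates why the reconstruction must rely on a sparsity prior instead of direct inversion.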
Even though the aforementioned devices are appropriate for spectral video acquisition owing to their high speed and the small number of measurements they require [12], these systems use a wavelength-independent coded aperture as an encoding element whose entries follow a random Gaussian or Bernoulli spatial distribution [11–16]. The modulation produced by these coded apertures fails to exploit the rich redundancy and the high correlation between adjacent frames in a video, leading to errors and poor reconstruction quality [17, 18]. Moreover, some coded aperture design strategies have been proposed for specific applications such as mismatching [19], super-resolution [20], and multi-patterned arrays of selectable optical filters, also called colored coded apertures (CCAs) [18, 21]. Nevertheless, these coded apertures are specifically