Temporally smooth privacy-protected airborne videos

Omair Sarwar¹,², Andrea Cavallaro² and Bernhard Rinner¹

Abstract: Recreational videography from small drones can capture bystanders who may be uncomfortable about appearing in those videos. Existing privacy filters, such as scrambling and hopping blur, address this issue through de-identification but generate temporal distortions that manifest themselves as flicker. To address this problem, we present a robust spatio-temporal hopping blur filter that protects privacy through de-identification of face regions. The proposed filter is meant for on-board installation and produces temporally smooth and pleasant videos. We apply hopping blur to protect each frame against identification attacks, and minimise the artefacts and flicker introduced by the hopping blur. We evaluate the proposed filter against different identification attacks and assess the quality of the resulting videos using a subjective test and objective measures.

I. INTRODUCTION

Recreational videography may capture faces, licence plates and windows of private houses, and may therefore lead to discomfort or privacy concerns. To address this problem, privacy filters are used to modify the appearance of privacy-sensitive image regions [1]–[6]. For example, the appearance of a captured face can be modified in order to conceal the identity of the person (see Fig. 1).

A privacy filter should cause only a minimal spatio-temporal distortion. However, filters such as scrambling [3] and hopping blur may generate abrupt changes in the intensity values of consecutive frames, thus resulting in unpleasant flicker. A privacy-protected video should also prevent person identification under different attacks, such as naïve and parrot attacks. Naïve attacks compare unprotected gallery faces against privacy-filtered probe faces, whereas parrot attacks de-identify both gallery and probe faces. In addition to the above, naïve-SR attacks first restore (e.g.
with super-resolution (SR) [7]) filtered probe faces and then compare them against unprotected gallery faces, and parrot-SR attacks filter both gallery and probe faces and restore them with super-resolution before comparing them against each other.

O. Sarwar was supported by the Erasmus Mundus Joint Doctorate in Interactive and Cognitive Environment, funded by the Education, Audiovisual & Culture Executive Agency under the FPA no 2010-0015; and in part by Intelligent Vision Austria. A. Cavallaro also acknowledges the support of the UK EPSRC project NCNR (EP/R02572X/1).

¹,² Omair Sarwar is with the Institute of Networked and Embedded Systems, Alpen-Adria-Universität Klagenfurt, Austria, and the Centre for Intelligent Sensing, Queen Mary University of London, United Kingdom. omair.sarwar@aau.at

² Andrea Cavallaro is with the Centre for Intelligent Sensing, Queen Mary University of London, United Kingdom. a.cavallaro@qmul.ac.uk

¹ Bernhard Rinner is with the Institute of Networked and Embedded Systems, Alpen-Adria-Universität Klagenfurt, Austria. bernhard.rinner@aau.at

Fig. 1: A face image de-identified (protected) with different privacy filters. (a) Original image (crop). Image protected with (b) pixelation, (c) hopping blur and (d) the proposed filter.

A privacy filter can be static or dynamic. Static filters keep their parameters, such as the standard deviation of a Gaussian blur, spatially and temporally constant [1], [6]. Static filters protect against naïve attacks, but are prone to parrot [2] and reconstruction attacks, such as naïve-SR and parrot-SR. Dynamic filters change their parameters spatially and/or temporally [3], [8] and protect faces against parrot and reconstruction attacks. However, they may introduce flicker.

Flicker-reduction approaches were developed for video compression [9]–[14] and can be applied prior to, during or after encoding [11], [12]. Approaches to be applied prior to [15] and during encoding [10]–[12] are coder-specific.
Approaches to be applied after encoding measure the spatio-temporal correlation between frames [9], [13], [14] and are generic. However, these approaches cannot be applied to our scenario, as correlation is compromised by privacy filters that use scrambling [3] and warping [8]. Therefore, an alternative solution for minimising flicker is needed.

In this paper, we present a privacy-preserving filter for drone videos that addresses the trade-off between privacy, fidelity and temporal smoothness. To the best of our knowledge, this is the first time that flicker reduction is considered for a privacy filter. The proposed filter minimises spatio-temporal distortions and is robust against naïve, parrot, naïve-SR and parrot-SR attacks. Depending on the resolution of the captured face, the parameters of an Adaptive Hopping Gaussian Mixture Model (AHGMM) filter are adjusted according to the target spatial distortion and are then mixed with decaying weights to minimise flicker.

II. PROBLEM DEFINITION

We aim to robustly protect a face with minimal spatial and temporal distortions, and to prevent various identification attacks. Let $R_m$ be a privacy-sensitive region, such as a face, in frame $I_m$. Let $\bar{R}_m$ be the corresponding privacy-protected region generated with filter $F_{\Omega^*_i}$, which uses as parameters $\Omega^*_i \in \{\Omega_0, \Omega_1, \Omega_2, \ldots\}$. The larger the index $i$, the stronger the distortion introduced in the privacy-protected region.
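
The ordering of the parameter sets in this definition can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each parameter set reduces to a single Gaussian standard deviation and that spatial distortion is measured as the mean squared error between the original and the protected region; neither choice is taken from the proposed filter. The names OMEGAS, apply_filter and distortion are hypothetical.

```python
import math

# Hypothetical parameter family {Omega_0, Omega_1, ...}: here each Omega_i is
# just a Gaussian standard deviation (an assumption made for illustration).
OMEGAS = [0.0, 1.0, 2.0, 4.0]


def gaussian_kernel(sigma):
    """Return a normalised 1-D Gaussian kernel; sigma <= 0 means identity."""
    if sigma <= 0:
        return [1.0]
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(values, kernel):
    """Convolve a 1-D sequence with the kernel, replicating border samples."""
    radius = len(kernel) // 2
    last = len(values) - 1
    out = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)
            acc += w * values[j]
        out.append(acc)
    return out


def apply_filter(region, sigma):
    """Stand-in for F_Omega: separable Gaussian blur of a 2-D region."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_1d(row, kernel) for row in region]
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def distortion(original, protected):
    """Mean squared error between a region R_m and its protected version."""
    count = sum(len(row) for row in original)
    err = sum((a - b) ** 2
              for ra, rb in zip(original, protected)
              for a, b in zip(ra, rb))
    return err / count


if __name__ == "__main__":
    # Toy 9x9 "region" with a single bright pixel in the centre.
    region = [[0.0] * 9 for _ in range(9)]
    region[4][4] = 1.0
    for i, sigma in enumerate(OMEGAS):
        protected = apply_filter(region, sigma)
        print(f"Omega_{i} (sigma={sigma}): distortion={distortion(region, protected):.4f}")
```

On this toy region the measured distortion grows with the index, which is the only property of the parameter family that the sketch is meant to show. It does not model hopping, the mixing with decaying weights, or any resistance to identification attacks.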