Temporally smooth privacy-protected airborne videos

Omair Sarwar 1,2, Andrea Cavallaro 2 and Bernhard Rinner 1

Abstract— Recreational videography from small drones can capture bystanders who may be uncomfortable about appearing in those videos. Existing privacy filters, such as scrambling and hopping blur, address this issue through de-identification but generate temporal distortions that manifest themselves as flicker. To address this problem, we present a robust spatio-temporal hopping blur filter that protects privacy through de-identification of face regions. The proposed filter is meant for on-board installation and produces temporally smooth and pleasant videos. We apply hopping blur to protect each frame against identification attacks, and minimise the artefacts and flicker introduced by the hopping blur. We evaluate the proposed filter against different identification attacks and assess the quality of the resulting videos using a subjective test and objective measures.

I. INTRODUCTION

Recreational videography may capture faces, licence plates or windows of private houses, and may therefore lead to discomfort or privacy concerns. To address this problem, privacy filters are used to modify the appearance of privacy-sensitive image regions [1]–[6]. For example, the appearance of a captured face can be modified in order to conceal the identity of the person (see Fig. 1).

A privacy filter should cause only a minimal spatio-temporal distortion. However, filters such as scrambling [3] and hopping blur may generate abrupt changes in the intensity values of consecutive frames, thus resulting in unpleasant flicker. A privacy-protected video should also prevent person identification under different attacks, such as naïve and parrot attacks. Naïve attacks compare unprotected gallery faces against privacy-filtered probe faces, whereas parrot attacks de-identify both gallery and probe faces. In addition to the above, naïve-SR attacks first restore (e.g.
with super-resolution (SR) [7]) filtered probe faces and then compare them against unprotected gallery faces, and parrot-SR attacks filter both gallery and probe faces and restore them with super-resolution before comparing them against each other.

O. Sarwar was supported by the Erasmus Mundus Joint Doctorate in Interactive and Cognitive Environment, funded by the Education, Audiovisual & Culture Executive Agency under the FPA no 2010-0015; and in part by Intelligent Vision Austria. A. Cavallaro also acknowledges the support of the UK EPSRC project NCNR (EP/R02572X/1).

1,2 Omair Sarwar is with the Institute of Networked and Embedded Systems, Alpen-Adria-Universität Klagenfurt, Austria, and the Centre for Intelligent Sensing, Queen Mary University of London, United Kingdom omair.sarwar@aau.at

2 Andrea Cavallaro is with the Centre for Intelligent Sensing, Queen Mary University of London, United Kingdom a.cavallaro@qmul.ac.uk

1 Bernhard Rinner is with the Institute of Networked and Embedded Systems, Alpen-Adria-Universität Klagenfurt, Austria bernhard.rinner@aau.at

Fig. 1: A face image de-identified (protected) with different privacy filters. (a) Original image (crop). Image protected with (b) pixelation, (c) hopping blur and (d) proposed filter.

A privacy filter can be static or dynamic. Static filters keep their parameters, such as the standard deviation of a Gaussian blur, spatially and temporally constant [1], [6]. Static filters protect against naïve attacks, but are prone to parrot [2] and reconstruction attacks, such as naïve-SR and parrot-SR. Dynamic filters change their parameters spatially and/or temporally [3], [8] and protect faces against parrot and reconstruction attacks. However, they may introduce flicker.

Flicker-reduction approaches were developed for video compression [9]–[14] and can be applied prior to, during or after encoding [11], [12].
Approaches to be applied prior to [15] and during encoding [10]–[12] are coder-specific. Approaches to be applied after encoding measure the spatio-temporal correlation between frames [9], [13], [14] and are generic. However, these approaches cannot be applied to our scenario as correlation is compromised by privacy filters that use scrambling [3] and warping [8]. Therefore, an alternative solution for minimising flicker is needed.

In this paper, we present a privacy-preserving filter for drone videos that addresses the trade-off between privacy, fidelity and temporal smoothness. To the best of our knowledge, this is the first time that flicker reduction is considered for a privacy filter. The proposed filter minimises spatio-temporal distortions and is robust against naïve, parrot, naïve-SR and parrot-SR attacks. Depending on the resolution of the captured face, the parameters of an Adaptive Hopping Gaussian Mixture Model (AHGMM) filter are adjusted according to the target spatial distortion and are then mixed with decaying weights to minimise flicker.

II. PROBLEM DEFINITION

We aim to robustly protect a face with minimal spatial and temporal distortions, and to prevent various identification attacks. Let $R_m$ be a privacy-sensitive region, such as a face, in frame $I_m$. Let $\bar{R}_m$ be the corresponding privacy-protected region generated with filter $F_{\Omega_i^*}$, which uses as parameters $\Omega_i^* \in \{\Omega_0, \Omega_1, \Omega_2, \ldots\}$. The larger the index, the stronger the distortion introduced in the privacy-protected region.
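
The ordered parameter family and the idea of mixing with decaying weights can be illustrated with a minimal Python sketch. It is not the AHGMM filter. It assumes that each parameter set is a single Gaussian standard deviation, treats the region as a one-dimensional row of intensities, picks the parameter at random in every frame, and uses an exponential moving average as one possible form of decaying-weight mixing. The names `OMEGAS`, `blur`, `hopping_blur`, `mix_decaying`, `flicker` and `demo` are hypothetical.

```python
import math
import random

# Hypothetical ordered parameter family {Omega_0, Omega_1, ...}.
# Assumption: each Omega_i is just a Gaussian standard deviation,
# so a larger index gives a stronger blur.
OMEGAS = [0.5, 1.0, 2.0, 4.0]


def gaussian_kernel(sigma):
    """Normalised 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(region, sigma):
    """Blur a 1-D region, replicating the border samples."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(region)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index to the region
            acc += w * region[j]
        out.append(acc)
    return out


def mse(a, b):
    """Mean squared error, used here as a simple spatial-distortion measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def hopping_blur(regions, rng):
    """Filter each frame's region with a randomly chosen parameter (hopping)."""
    return [blur(r, rng.choice(OMEGAS)) for r in regions]


def mix_decaying(filtered, alpha=0.3):
    """Mix filtered regions over time with exponentially decaying weights."""
    mixed = [filtered[0]]
    for cur in filtered[1:]:
        prev = mixed[-1]
        mixed.append([alpha * c + (1 - alpha) * p for c, p in zip(cur, prev)])
    return mixed


def flicker(frames):
    """Mean absolute intensity change between consecutive frames."""
    total = 0.0
    for a, b in zip(frames, frames[1:]):
        total += sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return total / (len(frames) - 1)


def demo(seed=0, n_frames=30, width=32):
    """Apply hopping blur and mixing to a static synthetic region."""
    rng = random.Random(seed)
    region = [rng.uniform(0, 255) for _ in range(width)]
    regions = [region] * n_frames  # the same region in every frame
    hopped = hopping_blur(regions, rng)
    mixed = mix_decaying(hopped)
    return region, hopped, mixed
```

For a static region, comparing `flicker(hopped)` with `flicker(mixed)` after calling `demo()` gives the frame-to-frame change before and after mixing, and `mse(region, blur(region, sigma))` gives the spatial distortion for a chosen parameter. The sketch says nothing about robustness to naïve, parrot or super-resolution attacks.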