A SPATIO-ANGULAR FILTER FOR HIGH QUALITY SPARSE LIGHT FIELD REFOCUSING

Martin Alain, Aljosa Smolic
V-SENSE Project, School of Computer Science and Statistics, Trinity College, Dublin

ABSTRACT

The ability to render synthetic depth-of-field effects post capture is a flagship application of light field imaging. However, many existing light field refocusing methods are known to suffer from severe artefacts, called angular aliasing, when applied to sparse light fields. We propose in this paper a method for high quality sparse light field refocusing based on insights from depth-based bokeh rendering techniques. We first provide an in-depth analysis of the geometry of the defocus blur in light field refocusing, by analogy with the defocus geometry in a traditional camera using the thin lens model. Based on this analysis, we propose a filter for removing angular aliasing artefacts in light field refocusing, which consists in modifying the well-known shift-and-sum algorithm to apply a depth-dependent blur to the light field between the shift and the sum operations. We show that our method achieves significant quality improvements compared to existing approaches for a reasonable computational cost.

Index Terms— Light field imaging, refocusing, angular aliasing, bokeh

1. INTRODUCTION

Light field imaging makes it possible to capture all light rays passing through a given region of 3D space [1, 2], in particular capturing the angular information which is lost in traditional 2D imaging systems. We focus in this paper on the common two-plane parameterisation of light fields, in which the light field is represented as a 4D function: Ω × Π → R, (s, t, u, v) ↦ p(s, t, u, v), where the plane Ω represents the spatial distribution of light rays, also called the image plane, indexed by (u, v), while Π, the camera plane, corresponds to their angular distribution, indexed by (s, t).
In practice, the light field parameterised with two parallel planes consists in a regularly sampled 2D grid of 2D images. The regular grid spacing on the camera plane is called the baseline, denoted b, while the 2D images are named sub-aperture images (SAIs). We consider in this paper the variables s, t, u, v to be metric, and define their corresponding scalar indices i, j, k, l, where i, j are camera indices and k, l are pixel indices. For convenience, we define L(i, j, k, l) = p(s, t, u, v) and we denote the SAIs by I_{i,j}(k, l) = L(i, j, k, l).

This publication has emanated from research conducted with the financial support of Science Foundation Ireland (SFI) under the Grant Number 15/RP/2776. This project has received funding from the European Union's Horizon 2020 Research and Innovation Programme under Grant Agreement No. 780470.

Applications of light fields notably include rendering novel image viewpoints [1, 3], estimating scene geometry in the form of disparity or depth maps [4–6], and synthetic depth-of-field rendering or refocusing [7, 8]. In this paper, we focus on the latter application, for which many methods have been proposed. The shift-and-sum algorithm [7, 9] is a simple and well-known method to produce refocused images from a light field, in which the light field SAIs are first shifted towards the target focal plane and then averaged. An extension of this concept to the Fourier domain was later proposed in [10]. More advanced filters in the 4D Fourier domain have then been proposed to perform volumetric refocusing [8]. More recently, the Fourier Disparity Layer representation has been proposed [11], which allows rendering and refocusing in real time by exploiting the parallelisation capabilities of modern GPUs.

However, the light field refocusing methods cited above exhibit artefacts, known as angular aliasing, when applied to sparse light field inputs.
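The shift-and-sum algorithm described above can be sketched in a few lines of NumPy. The following is a minimal illustration under our own assumptions (integer pixel shifts, circular border handling, a 4D array of shape (S, T, U, V)), not the authors' implementation; a practical version would interpolate fractional shifts and handle borders properly:

```python
import numpy as np

def shift_and_sum(lf, d):
    """Refocus a 4D light field lf of shape (S, T, U, V) by shift-and-sum.

    d is the signed per-view shift, in pixels per unit camera-grid offset,
    selecting the target focal plane: each SAI is shifted by its offset
    from the central view scaled by d, then all shifted SAIs are averaged.
    Integer shifts only, for simplicity of illustration.
    """
    S, T, U, V = lf.shape
    ci, cj = S // 2, T // 2                # central view indices
    out = np.zeros((U, V), dtype=np.float64)
    for i in range(S):
        for j in range(T):
            dk = int(round(d * (i - ci)))  # vertical shift for this SAI
            dl = int(round(d * (j - cj)))  # horizontal shift
            # np.roll performs a circular shift; border effects are
            # ignored in this sketch
            out += np.roll(lf[i, j], shift=(dk, dl), axis=(0, 1))
    return out / (S * T)
```

Objects whose inter-view disparity matches the chosen shift are aligned across all SAIs and rendered sharp, while objects at other depths are spread over the views and blurred by the averaging.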
The formal definition of densely (and by opposition sparsely) sampled light fields is given in the study of plenoptic sampling [12–14]. In [12], Chai et al. first provided guidelines for dense light field sampling. Considering the disparity between neighbouring SAIs, the condition for a densely sampled light field is that this disparity should not exceed one pixel. Such a condition is difficult to satisfy in practice, and many existing light field datasets are not strictly dense, in particular when captured with a gantry or a camera array. Therefore, multiple approaches have been developed to address angular aliasing in light field refocusing. A direct approach consists in reconstructing a dense light field from the sparse input before refocusing [15]. In order to avoid reconstructing a full dense light field or performing any pre-processing of the light field, Xiao et al. [16] proposed a method to detect angular aliasing using a statistical analysis of the refocused light field, and to reduce the aliasing by using lower resolution versions of the refocused image from a Gaussian pyramid, which are then fused with Poisson image editing techniques [17]. Wang et al. proposed to use depth-based bokeh rendering methods (discussed below) in order to avoid angular aliasing artefacts, combined with super-resolution of the in-focus region to render the final image [18]. A learning-based method was recently proposed in which the angular aliasing filtering is treated as a denoising problem solved with a convolutional neural network [19].

Predating light field refocusing, rendering synthetic bokeh has been a long-standing application in computer graphics [20–22]. By analysing the geometry of the defocus blur in traditional cameras using the thin lens model, the radius of the circle of confusion (CoC) can be expressed as a function of the aperture radius, the depth of the point light source, the depth of the focal plane, and the lens focal length.
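As an illustration of this thin-lens relationship, the standard CoC expression can be computed as below. This is the textbook thin-lens formula, not the paper's own derivation, and the function and parameter names are ours:

```python
def coc_radius(aperture_radius, focal_length, focus_depth, point_depth):
    """Radius of the circle of confusion (CoC) under the thin lens model.

    A point light source at distance point_depth from the lens, imaged by
    a lens of the given focal_length focused at focus_depth, produces on
    the sensor a blur disc of this radius. All quantities are in the same
    metric unit; a point exactly on the focal plane yields a radius of 0.
    """
    # sensor-side magnification of the in-focus plane
    m = focal_length / (focus_depth - focal_length)
    return aperture_radius * abs(point_depth - focus_depth) / point_depth * m
```

The radius grows with the aperture and with the point's distance from the focal plane, which is the behaviour a depth-based bokeh renderer reproduces per pixel using the depth map.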
Given an all-in-focus input image, its corresponding depth map and the camera parameters, a synthetic