Communication

Model Specialization for the Use of ESRGAN on Satellite and Airborne Imagery

Étienne Clabaut 1,*, Myriam Lemelin 1, Mickaël Germain 1, Yacine Bouroubi 1 and Tony St-Pierre 2

1 Département de Géomatique Appliquée, Université de Sherbrooke, Sherbrooke, QC J1K 2R1, Canada; myriam.lemelin@usherbrooke.ca (M.L.); mickael.germain@usherbrooke.ca (M.G.); yacine.bouroubi@usherbrooke.ca (Y.B.)
2 XEOS Imaging Inc., Québec, QC G1P 4P5, Canada; tony.stpierre@xeosimaging.com
* Correspondence: etienne.clabaut@usherbrooke.ca

Citation: Clabaut, É.; Lemelin, M.; Germain, M.; Bouroubi, Y.; St-Pierre, T. Model Specialization for the Use of ESRGAN on Satellite and Airborne Imagery. Remote Sens. 2021, 13, 4044. https://doi.org/10.3390/rs13204044

Academic Editor: Tania Stathaki
Received: 3 September 2021; Accepted: 5 October 2021; Published: 10 October 2021

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Abstract: Training a deep learning model requires highly variable data to permit reasonable generalization. If the variability in the data to be processed is low, the value of such generalization seems limited. Yet it can prove useful to specialize the model with respect to a particular theme. Enhanced super-resolution generative adversarial networks (ESRGAN), a specific type of deep learning architecture, allow the spatial resolution of remote sensing images to be increased by "hallucinating" non-existent details. In this study, we show that ESRGAN creates better-quality images when trained on thematically classified images than when trained on a wide variety of examples.
All things being equal, we further show that the algorithm performs better on some themes than on others. Texture analysis shows that these performances are correlated with the inverse difference moment and entropy of the images.

Keywords: super-resolution; ESRGAN; generative adversarial networks; Haralick

1. Introduction

Images of high (HR, ~1–5 m per pixel) and very high (VHR, <1 m per pixel) spatial resolution are of particular importance for several Earth observation (EO) applications, such as both visual and automatic information extraction [1–3]. However, most high-resolution and all very high-resolution images currently acquired by orbital sensors must be purchased at a high price. On the other hand, abundant medium-resolution imagery is freely available (e.g., the multispectral instrument onboard Sentinel-2 and the operational land imager onboard Landsat-8). Improving the spatial resolution of medium-resolution imagery to that of high- and very high-resolution imagery would thus be highly useful in a variety of applications.

Image resolution enhancement is called super-resolution (SR) and is currently a very active research topic in EO image analysis [4–6] and in computer vision in general [7]. However, SR is inherently an ill-posed problem [8]. Multi-frame super-resolution (MFSR) uses multiple low-resolution (LR) images to constrain the reconstruction of a high-resolution (HR) image; this approach cannot be used, however, when only a single image is available. Single-image super-resolution (SISR) is the particular type of SR that increases the resolution of a single LR image to create an HR image.
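The abstract above relates super-resolution performance to two Haralick texture measures: the inverse difference moment (IDM) and entropy, both derived from a gray-level co-occurrence matrix (GLCM). As a minimal NumPy sketch of how these measures behave, the snippet below quantizes an image to 8 gray levels and uses a single horizontal pixel offset; these are illustrative choices, not the exact configuration used in the study:

```python
import numpy as np

def glcm(img, levels=8):
    """Symmetric, normalized GLCM for the horizontal offset (0, 1).
    Assumes a non-negative image with img.max() > 0."""
    q = (img.astype(float) / img.max() * (levels - 1)).astype(int)
    m = np.zeros((levels, levels))
    for a, b in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):
        m[a, b] += 1
        m[b, a] += 1          # symmetrize
    return m / m.sum()

def entropy(p):
    # Shannon entropy of the co-occurrence distribution (bits).
    nz = p[p > 0]
    return float(-np.sum(nz * np.log2(nz)))

def inverse_difference_moment(p):
    # IDM (local homogeneity): near 1 for uniform patches, lower for noise.
    i, j = np.indices(p.shape)
    return float(np.sum(p / (1.0 + (i - j) ** 2)))

rng = np.random.default_rng(0)
noisy = rng.integers(0, 256, size=(64, 64))   # high-entropy texture
flat = np.full((64, 64), 128)                 # perfectly uniform patch

p_noisy, p_flat = glcm(noisy), glcm(flat)
# A uniform patch gives entropy 0 and IDM 1; random noise gives
# high entropy and IDM well below 1.
```

A uniform patch concentrates all GLCM mass on the diagonal (IDM = 1, entropy = 0), while random noise spreads mass across cells, which is the direction of the correlation reported in the abstract.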
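The adversarial training idea underlying ESRGAN can be illustrated with a toy example: a generator and a discriminator optimized against each other. The NumPy sketch below is a deliberately minimal 1-D stand-in (an affine generator and a logistic-regression discriminator, with the non-saturating generator loss), not the ESRGAN architecture used in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# "Real" data are samples from N(4, 1); the generator maps noise
# z ~ N(0, 1) through a learnable affine map g(z) = w_g * z + b_g.
w_g, b_g = 0.1, 0.0      # generator parameters
w_d, b_d = 0.1, 0.0      # discriminator (logistic regression) parameters
lr = 0.05

for step in range(2000):
    real = rng.normal(4.0, 1.0, size=64)
    z = rng.normal(size=64)
    fake = w_g * z + b_g

    # Discriminator ascent on log D(real) + log(1 - D(fake)).
    p_real = sigmoid(w_d * real + b_d)
    p_fake = sigmoid(w_d * fake + b_d)
    w_d += lr * (np.mean((1 - p_real) * real) + np.mean(-p_fake * fake))
    b_d += lr * (np.mean(1 - p_real) + np.mean(-p_fake))

    # Generator ascent on log D(fake) (non-saturating loss):
    # d/dfake log D(fake) = (1 - D(fake)) * w_d.
    p_fake = sigmoid(w_d * fake + b_d)
    grad_fake = (1 - p_fake) * w_d
    w_g += lr * np.mean(grad_fake * z)
    b_g += lr * np.mean(grad_fake)

fake_mean = float(np.mean(w_g * rng.normal(size=10_000) + b_g))
```

After training, the generated samples drift toward the real-data mean, which is the adversarial dynamic that, at image scale, lets a super-resolution generator produce outputs the discriminator cannot tell from real high-resolution imagery.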
SISR can be achieved by (1) the "external example-based" approach, where the algorithm learns from dictionaries [9]; (2) convolutional neural networks (CNNs), where the algorithm "learns" the image features that are relevant for improving its resolution [10,11]; or (3) generative adversarial networks (GANs) [12]. GANs pit two networks, a generator and a discriminator, against each other in an adversarial way. The generator is trained to produce new images that fool the discriminator, which in turn tries to distinguish real images from generated (fake) ones. In this type of network, the generator and the discriminator act