Sentinel2GlobalLULC: A deep-learning-ready Sentinel-2 RGB image dataset for global land use/cover mapping

Yassir Benhammou 1,2,*, Domingo Alcaraz-Segura 3,4,5,*, Emilio Guirado 5,6,*, Rohaifa Khaldi 3, Boujemâa Achchab 2, Francisco Herrera 1, and Siham Tabik 1,*

1 Department of Computer Science and Artificial Intelligence, Andalusian Research Institute in Data Science and Computational Intelligence, DaSCI, University of Granada, 18071, Granada, Spain
2 Systems Analysis and Modeling for Decision Support Laboratory, Higher National School of Applied Sciences of Berrechid, Hassan 1st University, Berrechid 218, Morocco
3 Department of Botany, Faculty of Science, University of Granada, 18071 Granada, Spain
4 iEcolab, Inter-University Institute for Earth System Research, University of Granada, 18006 Granada, Spain
5 Andalusian Center for Assessment and Monitoring of Global Change (CAESCG), University of Almería, 04120 Almería, Spain
6 Multidisciplinary Institute for Environment Studies “Ramon Margalef”, University of Alicante, San Vicente del Raspeig, 03690 Alicante, Spain
* corresponding authors: Yassir Benhammou (benhammou@correo.ugr.es), Domingo Alcaraz-Segura (dalcaraz@ugr.es), Emilio Guirado (e.guirado@ual.es), Siham Tabik (siham@ugr.es)

ABSTRACT

Land-Use and Land-Cover (LULC) mapping is relevant for many applications, from Earth system and climate modelling to territorial and urban planning. Global LULC products are continuously developing as remote sensing data and methods grow. However, there is still low consistency among LULC products due to low accuracy for some regions and LULC types. Here, we introduce Sentinel2GlobalLULC, a Sentinel-2 RGB image dataset, built from the consensus of 15 global LULC maps available in Google Earth Engine. Sentinel2GlobalLULC v1.1 contains 195,572 RGB images organized into 29 global LULC mapping classes.
Each image is a tile of 224 × 224 pixels at 10 × 10 m spatial resolution, built as a cloud-free composite from all Sentinel-2 images acquired between June 2015 and October 2020. The metadata include a unique LULC type annotation per image, together with the level of consensus, reverse geo-referencing, and the global human modification index. Sentinel2GlobalLULC is optimized for state-of-the-art Deep Learning models, providing a new gateway towards building precise and robust global or regional LULC maps.

1 Background & Summary

Land-Use and Land-Cover mapping aims to summarize the continuous biophysical properties of the Earth surface into synthetic categorical classes of natural or human origin, such as forests, shrublands, grasslands, marshlands, croplands, urban areas or water bodies 1. High-resolution LULC mapping plays a key role in many fields, from natural resources monitoring to biodiversity conservation, urban planning, agricultural management, and climate and Earth system modelling 2–4. Multiple LULC products have been derived using satellite information at the global scale (Table 2), contributing to a better monitoring and understanding of our planet 5, 6. However, despite the acceptable accuracy of each individual product, considerable disagreement between products has been reported 4, 7–22. There are several methodological reasons behind this problem:

• Different satellite sensors with different spatial resolutions were used in each product, so the difference in precision from coarse to fine resolution partially determines the final quality of each product.
• Different pre-processing techniques, such as atmospheric correction, cloud removal and image composition, were used in each LULC product.
• Each LULC product has a different temporal updating rate: some are regularly updated, whereas others have never been updated.
bioRxiv preprint doi: https://doi.org/10.1101/2021.12.01.470768; this version posted December 9, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.