https://doi.org/10.1007/s10707-020-00415-w

JS4Geo: a canonical JSON Schema for geographic data suitable to NoSQL databases

Angelo A. Frozza 1 · Ronaldo dos S. Mello 2

Received: 1 September 2019 / Revised: 28 April 2020 / Accepted: 18 May 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract

The large volume and variety of data produced in the current Big Data era lead companies to seek solutions for efficient data management. Within this context, NoSQL databases have emerged as a better alternative to traditional relational databases, mainly in terms of scalability and availability of data. A common feature of NoSQL databases is that they are schemaless, i.e., they either impose no schema or have a flexible one. This is attractive for systems that deal with complex data, such as Geographic Information Systems (GIS). However, the lack of a schema becomes a problem when applications need to perform processes such as data validation, data integration, or data interoperability, as there is no standard for schema representation in NoSQL databases. On the other hand, the JSON language stands out as a standard for representing and exchanging data in document NoSQL databases, and JSON Schema is a schema representation language for JSON documents that is also on its way to becoming a standard. However, it does not include spatial data types. To address this limitation, this paper proposes an extension to JSON Schema, called JS4Geo, that allows the definition of schemas for geographic data. We demonstrate that JS4Geo is able to represent schemas of any NoSQL data model, as well as other standards for geographic data, like GML and KML. We also present a case study that shows how a data integration system can benefit from JS4Geo to define local schemas for geographic datasets and generate an integrated global schema.

Keywords Geographic data · NoSQL · JSON · JSON Schema · GeoJSON · JS4Geo

Angelo A. Frozza
angelo.frozza@ifc.edu.br

Ronaldo dos S. Mello
r.mello@ufsc.br

1 Instituto Federal Catarinense - IFC, Santa Catarina, Brazil
2 Universidade Federal de Santa Catarina - UFSC, Florianópolis, Brazil

Published online: 27 June 2020
Geoinformatica (2020) 24:987–1019

1 Introduction

We live today in the so-called Big Data era, where large volumes of digital data are produced at a very high speed, stored in a distributed way and shared in different formats [21]. In this