International Journal of Computer Applications (0975 8887) Volume 168 No.5, June 2017 14 JSON Integration in Relational Database Systems Dušan Petković University of Applied Sciences Hochschulstr. 1, Rosenheim, 83024, Germany ABSTRACT Recently, a new era of application development is emerging, which us based upon big data technology and the ease of access to compute resources, such as mobile devices. All these issues can be better supported using JSON (and JavaScript) technology. Almost all relational database systems have integrated JSON, partly according to the specification given in the ANSI SQL standard and partly according to other specifications. In this article we discuss several JSON features and investigate how different relational database systems (RDBMSs) have integrated them. Of all database systems discussed in this paper Oracle has implemented the most concepts specified in the ANSI SQL/JSON standard. In contrast to Oracle, PostgreSQL have not implemented any standardized features. Also, we discuss conformance of all these implementations in relation to the ANSI SQL/JSON standard and give suggestions, which important features should be implemented in the future releases of the RDBMSs. Keywords JSON, SQL/JSON, relational database systems, JSON integration 1. INTRODUCTION JSON (JavaScript Object Notation) is a simple data format used for data interchange. The structure of JSON content follows the syntax structure for JavaScript. The following example: {"info": {"who": "Fred" ,"where": "BBC", "friends":[{"name": "Lili", "rank":5},{"name": "Hank", "rank": 7}]}} shows a JSON string called info that describes a single person, Fred, his affiliation, BBC, and his friends, Lili and Hank. Generally, a JSON string contains either an array of values or an object, which is an array of name/value pairs. An array is surrounded by a pair of square brackets and contains a comma-separated list of values. An object is surrounded by a pair of curly brackets and contains a comma-separated list of name/value pairs. A name/value pair consists of a field name (in double quotes), followed by a colon (:), followed by the field value (in double quotes). 1.1 Why Support JSON in RDBMSs? There are several reasons why it is necessary to integrate JSON in relational database systems: Storage of semi-structured data Databases provide reduced administrative costs Increased developer productivity Relational tables contain structured data. The advantage of JSON is that it can contain both structured and semi- structured data. By supporting storage of JSON objects, a relational database extends its capabilities and integrates structured and semi-structured data together. When JSON objects are stored individually and are used for separate programs, each program has to administrate its own data. In case of JSON support through a relational database system, the system administers all data. The same is true for security and transaction management, because the system takes over the management of security and transaction processing, meaning that this functionality does not need to be implemented in users’ programs. Finally, integrating JSON in a RDBMS increases productivity, because the system takes over a lot of tasks that otherwise must be implemented programmers in their programs. 1.1.1 Integration Features Discussed For all RDBMSs mentioned in this article, the following features will be discussed: Storing JSON documents in RDBMSs Presentation of JSON documents Indexing techniques for JSON documents Querying JSON documents 1.1.1.1. Storing JSON in RDBMSs Generally, there are three different ways in which data presented in a particular format can be stored in relational form: As raw documents Decomposed into relational columns Using native storage In case that JSON documents are stored using either VARCHAR, CLOB or BLOB data type, an exact copy of the data is stored. In this case, JSON documents are stored “raw”—that is, in their character string form. The raw form allows insertion of JSON documents in an easy way. The retrieval of such a document is very efficient if the entire document is retrieved. To retrieve parts of the documents, special types of indices are required. To decompose a document into separate columns of one or more tables, its schema is used. In this case, the hierarchical structure of the document is preserved, while order among elements is ignored. Note that storing JSON documents in decomposed form can be applied in rare cases, where the corresponding schema exists. Native storage means that JSON documents are stored in their parsed form. In other words, the document is stored in an internal representation that preserves the content of the data. Using native storage makes it easy to query information based on the structure of the document. On the other hand, reconstructing the original form of the document is difficult, because the created content may not be an exact copy of the document. 1.1.1.2. Presentation of JSON There are two different issues concerning presentation of JSON: JSON documents can be projected in relational form and relational data stored in a table can be published as JSON documents. One common reason for projecting JSON documents in relational form is that existing legacy applications, packaged