JIT happens: Transactional Graph Processing in Persistent Memory meets Just-In-Time Compilation

Muhammad Attahir Jibril, TU Ilmenau, Germany, muhammad-attahir.jibril@tu-ilmenau.de
Alexander Baumstark, TU Ilmenau, Germany, alexander.baumstark@tu-ilmenau.de
Philipp Götze, TU Ilmenau, Germany, philipp.goetze@tu-ilmenau.de
Kai-Uwe Sattler, TU Ilmenau, Germany, kus@tu-ilmenau.de

ABSTRACT
Graph databases are used for different applications such as analyzing large networks, representing and querying knowledge graphs, and managing master data and complex data structures. Besides graph analytics, the transactional processing of concurrent updates and queries represents a challenging data management task. In this paper, we investigate the usage of persistent memory as a very promising technology for graph processing. We present a novel architecture for transactional processing of queries and updates on a property graph model that exploits and addresses the specific characteristics of persistent memory by hybrid storage and memory management as well as a just-in-time query compilation approach. Our experimental evaluation on interactive short read and update query workloads shows that PMem-based systems that are well designed to exploit PMem characteristics outperform traditional disk-based systems significantly and have only a small overhead compared to DRAM-only systems. Moreover, the evaluation shows that JIT compilation brings performance benefits, especially when an adaptive compilation approach is leveraged to hide the overhead of compilation as well as the latency of PMem.

1 INTRODUCTION
Graph databases represent an important class of NoSQL systems with numerous flavors, including systems for analyzing large graphs, systems for querying knowledge bases, and systems supporting updates on graphs and navigational queries.
They are designed for different graph data models ranging from RDF triples to property graph models, as well as different processing models from database query processing to approaches like the bulk synchronous parallel (BSP) model.

The numerous available systems mainly adopt the typical architectures of database systems, i.e., traditional disk-based architecture, in-memory architecture, or scalable, distributed solutions. Graph data are either stored in disk-based data structures and loaded into memory for processing or kept directly in in-memory structures (without the need to load data during startup) while using techniques like logging to allow for persistent updates.

In this work, we present a novel architecture for graph databases based on persistent memory (PMem). PMem, also known as non-volatile memory (NVM) or storage-class memory (SCM), is one of the most promising trends in hardware development and has the potential to hugely impact database system architectures. Characteristics such as byte-addressability, read latency close to DRAM but with read-write asymmetry, and inherent persistence open up new opportunities for database systems. Specifically, Intel's Optane DC Persistent Memory Modules (DCPMMs) are already available on the market and supported by the Persistent Memory Development Kit (PMDK) [17].

© 2021 Copyright held by the owner/author(s). Published in Proceedings of the 24th International Conference on Extending Database Technology (EDBT), March 23-26, 2021, ISBN 978-3-89318-084-4 on OpenProceedings.org. Distribution of this paper is permitted under the terms of the Creative Commons license CC-by-nc-nd 4.0.

Several studies, as well as our experiments, have identified the following characteristics of this technology (we elaborate on these in Section 3):
(C1) PMem has a higher latency and lower bandwidth than DRAM.
(C2) Reads and writes on PMem behave asymmetrically.
(C3) DCPMMs internally work on 256-byte blocks.
(C4) Failure atomicity is only guaranteed for 8-byte aligned writes.

The focus of our work is an architecture for hybrid transactional/analytical processing (HTAP) on a property graph model. Transaction support covers insert/update/delete operations on nodes, relationships, and their properties with ACID guarantees. Furthermore, we support Cypher-like navigational queries. In this paper, we particularly focus on data structures and techniques for query and transaction processing in graph databases that exploit PMem and address the characteristics (C1)-(C4) mentioned above. Although we aim for HTAP, we do not yet consider graph analytics in this paper. Exploiting PMem for graph analytics is discussed by other researchers, e.g., in [13].

Our contributions are as follows:
• We present the architecture of an HTAP graph engine with storage structures designed for PMem, primarily taking (C1)-(C3) into account.
• We discuss the implementation of a timestamp ordering-based multiversion concurrency control (MVTO) protocol optimized for PMem, addressing (C4).
• We describe our just-in-time (JIT) query compilation approach for compiling graph queries into machine code to hide the higher latency of PMem as described in (C1).

Thus, the novelty of our work lies in the design, adaptation, and evaluation of transaction and query processing techniques that leverage the idiosyncrasies of persistent memory for graph databases.

Series ISSN: 2367-2005. DOI: 10.5441/002/edbt.2021.05

2 RELATED WORK
Several of the approaches presented in this paper are based on insights from previous work. In particular, the lessons learned regarding the new concepts of data structures for PMem had a