Enhanced Segment Trees in Object-Relational Mapping

Michal Gawarkiewicz
Faculty of Mathematics and Computer Science, Nicolaus Copernicus University, Toruń, Poland
garfi@mat.umk.pl

Piotr Wiśniewski
Faculty of Mathematics and Computer Science, Nicolaus Copernicus University, Toruń, Poland
pikonrad@mat.umk.pl

Krzysztof Stencel
Institute of Informatics, University of Warsaw, Warsaw, Poland
stencel@mimuw.edu.pl

ABSTRACT

Tree-shaped data often occur in business applications, e.g. a corporate hierarchy or a categorization of products. A natural class of analytic queries posed to such data consists of aggregate queries over subtrees. Evaluating such queries on large data sets requires a significant amount of time. In this paper we focus on dedicated data structures that materialize partial results of such queries in the form of the well-known segment trees. In a multiprogramming environment such data structures require careful implementation. A naïve design suffers from synchronization problems, because the root of such a structure is updated by every transaction that changes anything in its subtree. We propose ring updates that allow the presented data structure to be used with multiple execution threads. Our implementation is designed to work with object-relational mapping systems. If an application uses stored hierarchical data, its designer can add annotations to augment mapped database objects with materialization of partial aggregations over subtrees. Mapping generators create all necessary storage objects and triggers. We describe our proof-of-concept prototype implementation of this feature in Hibernate. We also present an experimental evaluation of this prototype's performance. The results confirm that the proposed materializations notably boost the evaluation of analytical queries over hierarchies.
Categories and Subject Descriptors
H.2.6 [Database Management]: Middleware for databases - Object-relational mapping facilities

General Terms
Performance

Keywords
materialized views, analytical queries, hierarchical data, object-relational mapping

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. BCI'13 September 19-21, 2013, Thessaloniki, Greece. Copyright 2013 ACM 978-1-4503-1851-8/13/09 ...$15.00.

1. INTRODUCTION

The architecture of an application is usually a collection of trade-offs. On the one hand, clear architectures facilitate development and maintenance, and they reduce the cost of these activities. On the other hand, they may hinder performance. One possible solution to this problem is the introduction of additional layers. Materialized views and object-relational mapping systems are examples of such layers. In this paper we analyze a class of solutions based on these observations.

Object-relational mappers (ORMs) are still underutilized. Moreover, they are often regarded as a performance hazard. In spite of the common perception that ORMs just provide the galvanic mapping between objects and rows, they also constitute a layer of middleware. This layer can be used to conceal a plethora of performance solutions that (1) significantly reduce the response time of an application and (2) are not visible to application programmers.

In earlier papers [1, 2, 3] we integrated recursive queries into ORMs. We showed that the speed of query processing against recursive structures increased significantly. We also enhanced ORMs with solutions to materialize partial aggregation [4].
They allow fast aggregate query processing without obscuring the architecture of an application.

In this paper we propose solutions that cater for notably trickier application needs. Assume a dimension table that is organized as a hierarchy, e.g. the employee table with a many-to-one subordinate-manager relationship. The fact table contains sales data. Each sale is connected to one employee. This database is frequently queried for the total sales of a given employee and all of his or her subordinates. Such queries are needed e.g. in companies that perform multi-level marketing.

We propose to accelerate such queries using materialized data structures similar to segment trees. We present enhanced segment trees that are (1) well-suited for arbitrary trees (possibly non-binary) and (2) efficient in a multithreaded execution environment. The root of a segment tree has to be updated by each transaction that modifies anything below it. The lock contention and possible deadlocks caused by the naïve solution are not acceptable in any application. The method proposed in this paper solves such synchronization problems.

We use the object-relational mapping layer to conceal all peculiarities of the solution. Generators built into the ORM take care of creating appropriate materializations in the database and of synchronizing concurrent updates.
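
To make the contention problem concrete, the following minimal sketch materializes one total per subtree of an employee hierarchy and maintains it in the naïve way described above. It is an in-memory illustration only, not the method proposed in this paper and not the Hibernate prototype. The names `Hierarchy`, `add_employee`, `record_sale` and `total_sales` are hypothetical and chosen for this example.

```python
from collections import defaultdict


class Hierarchy:
    """Employee tree with one materialized sales total per subtree.

    Hypothetical illustration of the naive upkeep: it is not the
    ring-update method and involves no database or ORM.
    """

    def __init__(self):
        self.parent = {}                       # employee -> manager (None for the root)
        self.subtree_total = defaultdict(int)  # employee -> sales of the whole subtree

    def add_employee(self, emp, manager=None):
        self.parent[emp] = manager

    def record_sale(self, emp, amount):
        """Add a sale and return the nodes whose totals had to be updated."""
        touched = []
        node = emp
        # Naive maintenance: walk up to the root and update every ancestor.
        while node is not None:
            self.subtree_total[node] += amount
            touched.append(node)
            node = self.parent[node]
        return touched

    def total_sales(self, emp):
        # The aggregate over the subtree is a single lookup.
        return self.subtree_total[emp]


h = Hierarchy()
h.add_employee("ceo")
h.add_employee("alice", "ceo")
h.add_employee("bob", "alice")
h.add_employee("carol", "ceo")

print(h.record_sale("bob", 100))   # ['bob', 'alice', 'ceo']
print(h.record_sale("carol", 50))  # ['carol', 'ceo']
print(h.total_sales("ceo"))        # 150
```

Both sales touch the entry for `ceo`, although they originate in disjoint branches. In a database setting each such update would need a lock on the root's row, which is the source of the contention and deadlock risk that the naïve design suffers from.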