TritanDB: Time-series Rapid Internet of Things Analytics

EUGENE SIOW, THANASSIS TIROPANIS, XIN WANG, and WENDY HALL, University of Southampton

arXiv:1801.07947v1 [cs.DB] 24 Jan 2018

The efficient management of data is an important prerequisite for realising the potential of the Internet of Things (IoT). Given the large volume of structured time-series IoT data, two issues are addressing the difficulties of data integration between heterogeneous Things and improving ingestion and query performance across databases, both on resource-constrained Things and in the cloud. In this paper, we examine the structure of public IoT data and discover that the majority exhibit unique flat, wide and numerical characteristics with a mix of evenly and unevenly-spaced time-series. We investigate the advances in time-series databases for telemetry data and combine these findings with microbenchmarks to determine the best compression techniques and storage data structures to inform the design of a novel solution optimised for IoT data. A query translation method with low overhead, even on resource-constrained Things, allows us to utilise rich data models like the Resource Description Framework (RDF) for interoperability and data integration on top of the optimised storage. Our solution, TritanDB, shows an order-of-magnitude performance improvement over many state-of-the-art databases, across both Things and cloud hardware, within IoT scenarios. Finally, we describe how TritanDB supports various analyses of IoT time-series data like forecasting.

CCS Concepts: • Information systems → Temporal data; Resource Description Framework (RDF); Database query processing; • Networks → Cyber-physical networks; • Theory of computation → Data compression;

Keywords: Internet of Things, Linked Data, Time-series data, Query Translation

1 INTRODUCTION

The rise of the Internet of Things (IoT) brings with it new requirements for data management systems. Large volumes of sensor data form streams of time-series input to IoT platforms that need to be integrated and stored. IoT applications that seek to provide value in real-time across a variety of domains need to retrieve, process and analyse this data quickly. Hence, data management systems for the IoT should support the collection, integration and analysis of time-series data. Performance and interoperability for such systems are two pressing issues explored in this paper.

Given the large volume of streaming IoT data, coupled with the emergence of Edge and Fog Computing networks [13] that distribute computing and storage functions along a cloud-to-thing continuum in the IoT, there is a case for investigating the specific characteristics of IoT data to optimise databases, both on resource-constrained Things and on dynamically-provisioned, elastically-scalable cloud instances, to better store and query IoT data. The difficulties in data integration between heterogeneous IoT Things, possibly from different vendors, different industries and conforming to specifications from different standards bodies, also drive our search for a rich data model that encourages interoperability, to describe and integrate IoT data, which can then be applied to databases with minimal impact on performance.

The Big Data era has driven advances in data management and processing technology, with new databases emerging for many specialised use cases. Telemetry data from DevOps performance monitoring scenarios of web-scale systems has pushed time-series databases to the forefront again. IoT data is a new frontier, a potentially larger source of time-series data given its ubiquitous nature, with data that exhibits its own unique set of characteristics. Hence, it follows that by investigating