INSTANS Comparison with C-SPARQL on Close Friends Simulation Report

Mikko Rinne, Haris Abdullah, Seppo Törmä, Esko Nuutila
Department of Computer Science and Engineering
Aalto University, School of Science
Konemiehentie 2, Espoo, Finland
{firstname.lastname}@aalto.fi

ABSTRACT
This document summarizes the results of empirical comparisons between Instans and C-SPARQL executed in March 2012. Although the approaches of the two systems are fundamentally different, the comparison is valuable for illustrating those differences and for documenting how they manifest themselves in an actual application.

Keywords
Event Processing, RDF, SPARQL Update, Streaming Data, C-SPARQL, INSTANS

1. INTRODUCTION
We have implemented a system for processing standing queries, denoted Instans¹. In our test system, SPARQL queries are parsed into a Rete network [4]. A fundamental principle of such incremental matchers is to have the complete set of rules actively running in the system simultaneously. This approach enables a group of SPARQL queries to work together as a “program”. The system does not “execute queries” on demand; instead, the Rete network reacts to changes in its inputs, producing an output when the pre-programmed conditions are matched.

In this study we compare different ways of using SPARQL for processing queries on flexible and heterogeneous event streams. The example has been published in [6], but the majority of the results were left to this document due to space restrictions. Event processing terms used in this document follow the conventions of the Event Processing Glossary v. 2.0 [5].

¹ Incremental eNgine for STANding Sparql, http://cse.aalto.fi/instans/

2. C-SPARQL
In C-SPARQL [1, 2, 3], RDF streams are built from time-annotated triples. C-SPARQL provides a window mechanism, where the window size can be defined based on either time or the number of triples to be held in each window.
Aggregation operators COUNT, MAX, MIN, SUM and AVG can be used to compute aggregate values from the windows. There is also a timestamp() function to access the timestamp of a variable. Aggregation operators were added to SPARQL Query² in version 1.1, but the query repetition and windowing environment remains a C-SPARQL speciality.

In light of the presented examples, C-SPARQL focuses on relatively homogeneous event streams, where each event is represented by a single triple annotated with a time. In this study we consider the suitability of C-SPARQL for processing heterogeneous events, and the benefits and challenges of the approach in such an environment.

C-SPARQL extends the SPARQL query language to support stream processing. The ability to automatically update graphs using SPARQL Update opens completely new possibilities for using SPARQL in event processing. To probe the limits of SPARQL 1.1 Query and Update in the event processing domain, our approach has been to avoid any non-standard extensions.

3. “CLOSE FRIENDS”
In our example application, registered users of the service (or users imported from another service) form a social network. Every subscriber has a mobile terminal capable of registering its location and transmitting location updates as a stream. All subscribers expect to be notified when a friend registers nearby. Using terms defined in the Event Processing Glossary [5], the architecture of the “Close Friends” system is shown in Figure 1, covering the following components:

1. Configuration: An RDF Store of semi-static information. In our example this includes the foaf:knows relationships of all subscribers. This data is required by the event processing agent to successfully process the streams. The configuration information can be obtained from end users subscribing via a web interface, possibly with their mobile clients.

2. Event Producers: Dynamic event data presented as Turtle-encoded RDF streams.
The model expects multiple mobile clients.

² http://www.w3.org/TR/sparql11-query/
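As a rough illustration of how the configuration data and the location streams described above could interact, the following Python sketch keeps the latest known position of each subscriber and reacts to every incoming location event. It is not the Instans or C-SPARQL implementation. The names `LocationEvent` and `CloseFriendsAgent`, the planar coordinates, the fixed `radius` threshold for "nearby", and the treatment of foaf:knows as symmetric are all assumptions made only for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class LocationEvent:
    """One location update from a mobile client (hypothetical shape)."""
    user: str
    x: float  # planar coordinates: an assumption made for simplicity
    y: float


class CloseFriendsAgent:
    """Hypothetical event processing agent for the "Close Friends" example."""

    def __init__(self, knows, radius):
        # Configuration: semi-static friendship pairs, standing in for the
        # foaf:knows relationships. Treated as symmetric (an assumption).
        self.friends = defaultdict(set)
        for a, b in knows:
            self.friends[a].add(b)
            self.friends[b].add(a)
        self.radius = radius   # assumed definition of "nearby"
        self.last_seen = {}    # latest known position per subscriber

    def on_event(self, event):
        """React to one event; return (recipient, nearby_friend) pairs."""
        self.last_seen[event.user] = (event.x, event.y)
        notifications = []
        for friend in sorted(self.friends[event.user]):
            position = self.last_seen.get(friend)
            if position is None:
                continue  # this friend has not reported a location yet
            distance = hypot(event.x - position[0], event.y - position[1])
            if distance <= self.radius:
                notifications.append((friend, event.user))
                notifications.append((event.user, friend))
        return notifications
```

Each call to `on_event` does a small amount of work driven by the new event alone, which loosely mirrors the standing-query idea of reacting to input changes instead of re-running queries on demand. A real deployment would work on geographic coordinates and RDF data, not on the simplified tuples assumed here.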