Data structures for temporal graphs based on compact sequence representations ☆

Diego Caro a,*, M. Andrea Rodríguez a, Nieves R. Brisaboa b

a Universidad de Concepción, Chile
b Universidade da Coruña, Spain

Article info

Article history:
Received 18 November 2013
Received in revised form 10 November 2014
Accepted 9 February 2015
Recommended by: Xifeng Yan
Available online 17 February 2015

Keywords: Temporal graphs; Compact data structures; Wavelet tree

Abstract

Temporal graphs represent vertices and binary relations that change over time. In this paper, a temporal graph is conceptualized as the sequence of changes on its edges during its lifetime, also known as temporal adjacency logs. The paper explores the use of compression techniques, and of compact and self-indexed data structures, to represent large temporal graphs. More specifically, we present four strategies to represent temporal graphs. The first two strategies, Time-interval Log per Edge (EdgeLog) and the Adjacency Log of Events (EveLog), use compression techniques over the inverted indexes that represent the adjacency logs. Then, we introduce two new strategies that represent temporal graphs using compact and self-indexed data structures. Compact Adjacency Sequence (CAS) represents changes on adjacent vertices as a sequence stored in a Wavelet Tree, and Compact Events ordered by Time (CET) represents the edges that change at each time instant using the Interleaved Wavelet Tree, a new compact and self-indexed data structure specifically designed in this work that is able to represent a sequence of multidimensional symbols (that is, tuples of symbols encoded together). We experimentally evaluate the four strategies and compare them with state-of-the-art alternatives, showing that the four strategies can represent large temporal graphs making efficient use of space, while keeping good time performance for a wide range of useful queries.
We conclude that the use of compression techniques, or of compact and self-indexed data structures, opens the possibility of designing interesting representations of temporal graphs that fit the needs of different application domains.

© 2015 Elsevier Ltd. All rights reserved.

Information Systems 51 (2015) 1–26
Contents lists available at ScienceDirect. Journal homepage: www.elsevier.com/locate/infosys
http://dx.doi.org/10.1016/j.is.2015.02.002
0306-4379/© 2015 Elsevier Ltd. All rights reserved.

☆ Diego Caro and M. Andrea Rodríguez were funded by Fondef D09I1185. Diego Caro is supported by a CONICYT scholarship for PhD. M. Andrea Rodríguez is funded by Fondecyt 1140428. Nieves Brisaboa is funded by MICINN (PGE and FEDER) Grants TIN2009-14560-C03-02, TIN2010-21246-C02-01 and CDTI CEN-20091048, and by Xunta de Galicia (co-funded with FEDER) ref. 2010/17. We would also like to thank Diego Seco and José Fuentes for their help in the preliminary discussions of the structures, Guillermo de Bernardo for his help providing all the Ks implementations, and Claudio Sanhueza from Yahoo! Labs, who helped us with the Flickr dataset.

* Corresponding author. Tel.: +56 41 2204319; fax: +56 41 2221770.
E-mail addresses: diegocaro@udec.cl (D. Caro), andrea@udec.cl (M. Andrea Rodríguez), brisaboa@udc.es (N.R. Brisaboa).

1. Introduction

Temporal graphs model real networks that exhibit dynamic behavior, where the interactions between elements of the network change over time. For example, consider an online social network where friends are added or removed over time, or a network of mobile communications where connections represent calls between mobile phones. Taking into account the temporal dynamism of graphs allows us to exploit information about temporal correlations and causality, which would be infeasible through a static (or classical) analysis [1,2]. Classical measures over static