Querying incomplete graphs with data

Gaëlle Fontaine¹ and Amélie Gheerbrant²

¹ Department of Computer Science, Universidad de Chile
² LIAFA (Université Paris Diderot - Paris 7 & CNRS)

1 Introduction

Graph databases underlie several modern applications such as social networks and the Semantic Web. In those scenarios, integrating and exchanging data is very common, which leads to a proliferation of incomplete graph data. However, the well-developed models of incompleteness of data do not apply to graph data. This is mainly because standard graph query languages concentrate on graph topology, which requires functionalities beyond the abilities of standard relational systems. Besides, many graph languages ignore the actual data stored.

Recently, however, languages combining the data and topology aspects of querying have been proposed for graph databases. An example is the query "Find pairs of people in a social network connected by professional links restricted to people of the same age". Formalisms developed to handle such queries include regular expressions with memory (REM), regular expressions with equalities (REE) [5], their extensions [1], as well as variants of XPath [4]. Handling incompleteness by languages dealing with pure graph topology has been studied in [2]. In this short note, we present preliminary results on dealing with incompleteness at the levels of both data and topology, using some of the recently proposed query languages.

2 Preliminaries

Let $\Sigma$ be a finite alphabet, let $N$ be a countable set of node ids, and let $D$ be an infinite alphabet of data values. A data graph $G$ (over $\Sigma$, $N$ and $D$) is a tuple $(V, E, \rho)$, where $V \subseteq N$ is a finite set of nodes, $E \subseteq V \times \Sigma \times V$ is a set of $\Sigma$-labeled edges, and $\rho : V \to D$ assigns a data item to each node. A path $\pi$ in $G$ is a sequence $v_0 a_0 v_1 a_1 v_2 \cdots v_{k-1} a_{k-1} v_k$ such that $(v_{i-1}, a_{i-1}, v_i) \in E$ for each $1 \le i \le k$. The data path associated with $\pi$ is the word $\rho(v_0) a_0 \rho(v_1) a_1 \cdots a_{k-1} \rho(v_k)$.
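To make the definitions of data graph, path, and data path concrete, here is a minimal Python sketch. The class name `DataGraph`, the helpers `is_path` and `data_path`, and the small social-network instance (node ids `n1` to `n3`, label `colleague`, ages as data values) are illustrative assumptions and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class DataGraph:
    """Illustrative encoding of a data graph (V, E, rho)."""
    nodes: set   # V: finite set of node ids
    edges: set   # E: triples (source, label, target)
    rho: dict    # rho: maps each node id to its data value


def is_path(g, path):
    """Check that [v0, a0, v1, ..., a_{k-1}, vk] is a path in g."""
    if len(path) % 2 == 0:          # a path alternates nodes and labels
        return False
    if path[0] not in g.nodes:      # covers the single-node path (k = 0)
        return False
    # Every consecutive (v_{i-1}, a_{i-1}, v_i) must be an edge of g.
    return all(
        (path[i], path[i + 1], path[i + 2]) in g.edges
        for i in range(0, len(path) - 2, 2)
    )


def data_path(g, path):
    """Replace each node on the path by its data value, keep the labels."""
    if not is_path(g, path):
        raise ValueError("not a path in the graph")
    return [g.rho[x] if i % 2 == 0 else x for i, x in enumerate(path)]


# Hypothetical instance: three people linked by professional edges,
# with ages as data values.
g = DataGraph(
    nodes={"n1", "n2", "n3"},
    edges={("n1", "colleague", "n2"), ("n2", "colleague", "n3")},
    rho={"n1": 30, "n2": 41, "n3": 30},
)
pi = ["n1", "colleague", "n2", "colleague", "n3"]
print(data_path(g, pi))  # [30, 'colleague', 41, 'colleague', 30]
```

In this instance the first and last data values of the data path coincide, which is the kind of condition the "same age" query from the introduction asks for.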
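The regular expressions with equality (REE) [5] are defined by the grammar

$$e ::= \varepsilon \mid a \mid e \cdot e \mid e \cup e \mid e^{+} \mid e_{=} \mid e_{\neq},$$

where $a$ ranges over $\Sigma$. Given an REE $e$, the language $L(e)$ is defined by induction using the following rules:

$$
\begin{aligned}
L(\varepsilon) &= \{ d : d \in D \}, \\
L(a) &= \{ d\,a\,d' : d, d' \in D \}, \\
L(e_{=}) &= \{ d_1 a_1 \ldots d_{n-1} a_{n-1} d_n \in L(e) : d_1 = d_n \}, \\
L(e_{\neq}) &= \{ d_1 a_1 \ldots d_{n-1} a_{n-1} d_n \in L(e) : d_1 \neq d_n \},
\end{aligned}
$$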