Quantifiers in XQuery

Norman May, Sven Helmer, and Guido Moerkotte
Universität Mannheim
Mannheim, Germany
norman, helmer, moer@pi3.informatik.uni-mannheim.de

Abstract

We present algebraic equivalences that make it possible to unnest nested algebraic expressions containing quantifiers for order-preserving algebraic operators. We illustrate how these equivalences can be applied successfully to unnest nested queries formulated in XQuery. Measurements illustrate the performance gains possible by unnesting.

1. Introduction

Quantification is a core feature of XQuery, the query language for XML data proposed by the W3C (http://www.w3.org/XML/Query). The keywords some and every are used to express existential and universal quantification, respectively.

Query optimization for queries containing quantification has been investigated in the relational and object-oriented context (see [2] for related work). Initially, unnesting techniques were applied at the source level. However, researchers now prefer to describe unnesting equivalences on the algebraic level, because such techniques remain valid independent of the query language, as long as it can be translated into the algebra. In addition, correctness proofs can be provided for the equivalences. Thus, we also define our unnesting techniques on the algebraic level.

However, if the order of the result is relevant, the unnesting techniques from the object-oriented and relational context cannot be applied. We show how to unnest XML queries containing quantifiers.

The paper is organized as follows. Section 2 briefly motivates and defines our algebra. For every unnesting equivalence presented in Section 3, we show how the equivalences are applied and provide performance figures for each evaluation plan. Section 4 concludes the paper.

2. Notation and Algebra

Our algebra works on sequences of sets of variable bindings, i.e., sequences of unordered tuples where every attribute corresponds to a variable.
We allow nested tuples, i.e., the value of an attribute may be a sequence of tuples. Single tuples are constructed using the standard $[\cdot]$ brackets. Concatenation of tuples and functions is denoted by $\circ$. The set of attributes defined for an expression $e$ is denoted by $\mathcal{A}(e)$. The set of free variables of an expression $e$ is denoted by $\mathcal{F}(e)$. Projection of a tuple on a set of attributes $A$ is denoted by $|_A$. For an expression $e_1$ possibly containing free variables, and a tuple $e_2$, we denote by $e_1(e_2)$ the result of evaluating $e_1$ where the bindings of free variables are taken from the variable bindings provided by $e_2$. Of course, this requires $\mathcal{F}(e_1) \subseteq \mathcal{A}(e_2)$, since every free variable of $e_1$ must be bound by $e_2$. For a set of attributes $A$ we define the tuple constructor $\bot_A$ such that it returns a tuple with the attributes in $A$ initialized to NULL.

For sequences $e$, we use $\alpha(e)$ to denote the first element of a sequence. We equate elements with single-element sequences. The function $\tau$ retrieves the tail of a sequence, and $\oplus$ concatenates two sequences. We denote the empty sequence by $\epsilon$.

Our algebra extends the SAL algebra [1] developed by Beeri and Tzaban. SAL is the order-preserving counterpart of the algebra used in [3], extended to handle semistructured data. To make the paper self-contained, we give definitions for all relevant algebraic operators, including our new result-constructing operator $\Xi$. In Figure 1 we define the algebraic operators recursively on their input sequences. For unary operators, if the input sequence is empty, the output sequence is also empty. For binary operators, the output sequence is empty whenever the left operand is an empty sequence.

In order to avoid special cases during the translation of XQuery into the algebra, we use the special algebraic operator () that returns a singleton sequence consisting of the empty tuple, i.e., a tuple with no attributes. We also define a duplicate-eliminating projection $\Pi^D_A$.
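The tuple and sequence notation of this section can be made concrete with a minimal sketch. It assumes a representation of our own choosing: tuples are Python dictionaries, sequences are Python lists, and an expression with free variables is a function applied to a tuple. All function names are hypothetical, and the recursive `select` only illustrates the style of definition by head and tail; it is not copied from Figure 1.

```python
from typing import Any, Callable, Dict, Iterable, List

Tup = Dict[str, Any]  # unordered tuple of variable bindings: attribute -> value
Seq = List[Tup]       # ordered sequence of tuples; [] plays the role of the empty sequence


def concat(t1: Tup, t2: Tup) -> Tup:
    """Tuple concatenation t1 ◦ t2 (attribute sets are assumed to be disjoint)."""
    return {**t1, **t2}


def project(t: Tup, attrs: Iterable[str]) -> Tup:
    """Projection t|_A: keep only the attributes named in attrs."""
    keep = set(attrs)
    return {a: v for a, v in t.items() if a in keep}


def null_tuple(attrs: Iterable[str]) -> Tup:
    """Tuple constructor ⊥_A: every attribute in attrs is set to NULL (None)."""
    return {a: None for a in attrs}


def head(e: Seq) -> Tup:
    """α(e): first element of a non-empty sequence."""
    return e[0]


def tail(e: Seq) -> Seq:
    """τ(e): the sequence without its first element."""
    return e[1:]


def seq_concat(e1: Seq, e2: Seq) -> Seq:
    """e1 ⊕ e2: sequence concatenation, preserving order."""
    return e1 + e2


def select(p: Callable[[Tup], bool], e: Seq) -> Seq:
    """Illustrative order-preserving selection, defined recursively on α and τ.

    p(head(e)) corresponds to evaluating an expression e1 with the bindings
    of the tuple e2 = head(e), i.e. e1(e2) in the notation of the text.
    """
    if not e:  # unary operator: empty input yields empty output
        return []
    rest = select(p, tail(e))
    return seq_concat([head(e)], rest) if p(head(e)) else rest


if __name__ == "__main__":
    books = [{"price": 10}, {"price": 30}, {"price": 20}]
    print(select(lambda t: t["price"] >= 20, books))
    print(concat({"x": 1}, null_tuple(["y", "z"])))
```

Representing tuples as dictionaries mirrors the requirement that tuples are unordered, while lists keep the order of the sequence, which is the property the order-preserving operators must maintain.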
Apart from the projection itself, the duplicate-eliminating projection has semantics similar to those of the distinct-values function of XQuery: it does not preserve order. However, we require it to be deterministic and idempotent. When we want to eliminate the set of attributes