RAL: an Algebra for Querying RDF

Flavius Frasincar (flaviusf@win.tue.nl), Geert-Jan Houben (houben@win.tue.nl), Richard Vdovjak (richardv@win.tue.nl) and Peter Barna (pbarna@win.tue.nl)
Eindhoven University of Technology, Department of Mathematics and Computer Science, PO Box 513, NL-5600 MB Eindhoven, the Netherlands

Abstract. To make the World Wide Web machine-understandable there is a strong demand both for languages describing metadata and for languages querying metadata. The Resource Description Framework (RDF), a language proposed by W3C, can be used for describing metadata about (Web) resources. RDF Schema (RDFS) extends RDF by providing means for creating application-specific vocabularies (ontologies). While these two languages are widely acknowledged as a standard means for describing Web metadata, a standardized language for querying RDF metadata is still an open issue. Research groups from both industry and academia are presently involved in proposing several RDF query languages. Due to the lack of an RDF algebra, such query languages use APIs to describe their semantics, and optimization issues are mostly neglected. This paper proposes RAL (an RDF algebra) as a reference mathematical study for RDF query languages and for performing RDF query optimization. We define the data model, we present the operators to manipulate the data, and we address the application of RAL for query optimization. RAL includes: extraction operators to retrieve the needed resources from the input RDF model, loop operators to support repetition, and construction operators to build the resulting RDF model.

Keywords: RDF(S), RDF(S) query language, RDF(S) algebra

1. Introduction

On the basis of its number of users and the attention it attracts, it is fair to say that the World Wide Web is the most popular source of information.
Given its overwhelming success and its considerable influence on the way in which we exchange information, some compare it to Gutenberg’s invention of the printing press. Computer applications make information available to a very diverse audience on different platforms, worldwide and 24 hours a day. This first generation of the Web depends strongly on the human skills of its users. Computers support the users by presenting the information on screen or on paper, but it is left to the users to read it.

© 2004 Kluwer Academic Publishers. Printed in the Netherlands.

In [7] Tim Berners-Lee describes a vision of a next-generation Web, called the Semantic Web (SW). In this vision he explains how the Web paradigm can be exploited in a context where computer applications deal with information. To make the information