Accelerating XML Query Matching through Custom Stack Generation on FPGAs

Roger Moussalli, Mariam Salloum, Walid Najjar, and Vassilis Tsotras

Department of Computer Science and Engineering
University of California, Riverside, CA 92521, USA
{rmous,msalloum,najjar,tsotras}@cs.ucr.edu
http://www.cs.ucr.edu

Abstract. Publish-subscribe systems represent the state of the art in information dissemination to multiple users. Such systems have evolved from simple topic-based systems to the current XML-enabled systems. Here, users pose complex queries (expressed in XPath) on the structure and content of the streaming documents. The parts of the documents that match the user queries are then returned to the users. This paper proposes a novel hardware architecture that exploits the parallelism found in XPath filtering systems. Using an incoming XML stream, parsing and matching with thousands of user profiles are performed simultaneously on a single FPGA, thus yielding up to three orders of magnitude higher throughput when compared to conventional approaches bound by the sequential aspect of software computing. By converting XPath expressions into custom stacks, our architecture is the first to provide full support for all structural XPath constructs, including parent-child and ancestor-descendant relations, whilst allowing wildcarding and recursion.

Keywords: FPGA, XML, Query, XPath, Compilation.

1 Introduction

Increased demand for timely and accurate event-notification systems has led to the wide adoption of Publish/Subscribe Systems (or simply pub-sub). A pub-sub is an asynchronous event-based dissemination system that consists of three components: publishers, who feed a stream of messages into the system; subscribers, who post their interests (also called profiles); and an infrastructure for matching subscriber interests with published messages and delivering matched messages to the interested subscribers.
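To make the three components concrete, the following minimal Python sketch models publishers, subscribers with profiles, and a matching infrastructure. It is an illustration only and not the architecture proposed in this paper: the names `Broker`, `subscribe`, and `publish` are hypothetical, profiles are modeled as simple predicates over a message, and delivery is synchronous for brevity, whereas a real pub-sub is asynchronous.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical types for illustration; not taken from the paper.
Message = Dict[str, str]
Profile = Callable[[Message], bool]


class Broker:
    """Matching infrastructure: stores subscriber profiles and routes messages."""

    def __init__(self) -> None:
        self.profiles: Dict[str, List[Profile]] = defaultdict(list)
        self.inboxes: Dict[str, List[Message]] = defaultdict(list)

    def subscribe(self, subscriber: str, profile: Profile) -> None:
        # A subscriber posts an interest (profile) with the infrastructure.
        self.profiles[subscriber].append(profile)

    def publish(self, message: Message) -> List[str]:
        # A publisher feeds a message in; it is delivered to every
        # subscriber with at least one matching profile.
        matched = []
        for subscriber, profiles in self.profiles.items():
            if any(profile(message) for profile in profiles):
                self.inboxes[subscriber].append(message)
                matched.append(subscriber)
        return matched


broker = Broker()
broker.subscribe("alice", lambda m: m["topic"] == "weather")
broker.subscribe("bob", lambda m: m["topic"] == "stocks")
delivered_to = broker.publish({"topic": "weather", "body": "Sunny in Riverside"})
```

In this sketch, every published message is checked against every stored profile in sequence, which is the kind of per-message matching work that grows with the number of subscribers.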
Pub-sub systems have enabled notification services for users interested in receiving news updates, stock prices, weather updates, etc.; examples include google.news.com, pipes.yahoo.com, and www.ticketmaster.com. Pub-sub systems have greatly evolved over time, adding further challenges and opportunities in their design and implementation. Earlier pub-subs involved simple topic-based communication. That is, subscribers could subscribe to a predefined collection of topics (e.g., news, weather, etc.). The

Y.N. Patt et al. (Eds.): HiPEAC 2010, LNCS 5952, pp. 141–155, 2010. © Springer-Verlag Berlin Heidelberg 2010
Int. Conf. on High-Performance Embedded Architectures and Compilers (HiPEAC), January 2010, Pisa, Italy