PUBSUB: An Efficient Publish/Subscribe System

Tania Banerjee Mishra, Sartaj Sahni
Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL 32611
{tmishra, sahni}@cise.ufl.edu

This research was supported, in part, by the US Air Force Research Laboratory, under grant FA8750-11-1-0245.

ABSTRACT
PUBSUB is a versatile, efficient, and scalable publish/subscribe system. This paper describes the architecture of PUBSUB together with some of its current capabilities. A version of PUBSUB optimized for event processing was benchmarked against the publish/subscribe systems BE-Tree and Siena, which are also optimized for event processing. PUBSUB processes events faster than Siena and BE-Tree. On our tests, the speedup of the fastest version of PUBSUB relative to Siena was 98% on average. Relative to BE-Tree, the speedup ranged from 1.23 to 1.48 and averaged 1.36 on the uniform tests; on the Zipf tests, PUBSUB was comparable to BE-Tree. The faster times result from the efficient data structures that PUBSUB uses to store subscriptions and from the fast algorithms developed to match events to subscriptions.

Keywords
Content-based publish/subscribe, Boolean expressions, efficient subscription matching

1. INTRODUCTION
A publish/subscribe (pub/sub) system maintains a database of subscriptions, where each subscription is a Boolean expression. For example, each subscription in the pub/sub system of a diverse online vendor may describe the conditions under which a customer may purchase a product. A customer interested in acquiring a camera may post his/her requirement as a subscription to the vendor’s pub/sub system by providing the Boolean expression:

item = “camera” ∧ price < $300 ∧ manufacturer ∈ {Sony, Nikon, Panasonic} ∧ zoom > 4×

This subscription uses four attributes of a product, namely, item, price, manufacturer and zoom. An attribute is also referred
to as a dimension. A predicate consists of an attribute, an operator and attribute value(s). For each permissible value of an attribute, the predicate evaluates to true or false. In the above example, price < $300 is a predicate that is true whenever the attribute price has a value below $300 and false otherwise. A subscription is a conjunction of predicates. Our example subscription is the conjunction of 4 predicates.

An event specifies the values of some attributes. For example, the availability of a new $199 5× zoom camera from “Sony” or a price change in an existing 5× zoom “Sony” camera to $199 may be specified by the event:

item = “camera” ∧ color = “red” ∧ weight = 8oz ∧ pixels = 14M ∧ price = $199 ∧ manufacturer = “Sony” ∧ zoom = 5×

The above event matches the example subscription because all 4 predicates in the subscription evaluate to true when the attributes used in the subscription are assigned the values specified in the event. The example subscription, however, is not matched by the following events:

item = “camera” ∧ pixels = 14M ∧ price = $399 ∧ manufacturer = “Sony” ∧ zoom = 5×

item = “camera” ∧ price = $129 ∧ manufacturer = “Sony”

The first of these events fails to match the subscription because the price is too high, and the second fails because it does not specify the value of an attribute (zoom) that occurs in the subscription.

When an event occurs, the pub/sub system reports all subscriptions in its database that are matched (or satisfied) by the event. Customers who posted these matching subscriptions may then be notified.
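The matching semantics just described can be illustrated with a minimal Python sketch. It is a naive linear scan over the subscription database, written only to make the definitions concrete; it is not the data structure or matching algorithm used in PUBSUB. The names `Predicate`, `matches`, and `matching_subscriptions`, the subscription identifier `s1`, and the encoding of prices, zoom, weight, and pixel counts as plain numbers are all assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# An event assigns values to some attributes (hypothetical encoding).
Event = Dict[str, object]


@dataclass(frozen=True)
class Predicate:
    """An attribute together with a test on that attribute's value."""
    attribute: str
    test: Callable[[object], bool]

    def holds(self, event: Event) -> bool:
        # A predicate on an attribute that the event does not specify
        # is treated as not satisfied.
        return self.attribute in event and self.test(event[self.attribute])


def matches(subscription: List[Predicate], event: Event) -> bool:
    """A subscription is a conjunction: every predicate must hold."""
    return all(p.holds(event) for p in subscription)


def matching_subscriptions(database: Dict[str, List[Predicate]],
                           event: Event) -> List[str]:
    """Naive scan: report the ids of all subscriptions matched by the event."""
    return [sid for sid, sub in database.items() if matches(sub, event)]


# The example camera subscription, with four predicates.
camera = [
    Predicate("item", lambda v: v == "camera"),
    Predicate("price", lambda v: v < 300),
    Predicate("manufacturer", lambda v: v in {"Sony", "Nikon", "Panasonic"}),
    Predicate("zoom", lambda v: v > 4),
]
database = {"s1": camera}

# The three example events from the text.
e1 = {"item": "camera", "color": "red", "weight": 8, "pixels": 14,
      "price": 199, "manufacturer": "Sony", "zoom": 5}
e2 = {"item": "camera", "pixels": 14, "price": 399,
      "manufacturer": "Sony", "zoom": 5}
e3 = {"item": "camera", "price": 129, "manufacturer": "Sony"}

print(matching_subscriptions(database, e1))  # ['s1']
print(matching_subscriptions(database, e2))  # []  (price too high)
print(matching_subscriptions(database, e3))  # []  (zoom not specified)
```

A scan of this kind evaluates every predicate of every subscription for each event, so its cost grows linearly with the size of the database; this is the cost that specialized subscription data structures aim to avoid.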
Pub/sub systems are used in diverse applications with varied performance requirements. For example, in some applications, events occur at a much higher rate than the posting/removal of subscriptions; in other applications, the subscription rate may be much higher than the event rate; and in yet other applications, the two rates may be comparable. Optimal performance in each of these scenarios may result from deploying a different data structure for the subscriptions or a different tuning of the same structure. Many commercial applications of pub/sub systems have thousands of attributes and millions of subscriptions. So, scalability in terms of number of attributes and number of subscriptions is critical.

In this paper, we describe the architecture of PUBSUB, which is a versatile and scalable pub/sub system that may be tuned to provide high performance for diverse application environments. PUBSUB is versatile because its architecture supports a variety of predicate types (e.g., ranges, regular expressions, string relations) as well as a heterogeneous collection of data structures for the representation of subscriptions in order to achieve high throughput. The performance