Integrated Control of Matching Delay and CPU Utilization in Information Dissemination Systems

Ming Chen, Xiaorui Wang, and Ben Taylor
Department of Electrical Engineering and Computer Science
University of Tennessee, Knoxville, TN 37996
{mchen11, xwang, btaylo22}@utk.edu

Abstract— The demand for high-performance information dissemination is increasing in many applications, such as e-commerce and security alerting systems. These applications usually require that the desired information be matched between sources and sinks, based on established subscriptions, in a timely manner, while system throughput is maximized to find more matched results. Existing work primarily focuses on only one of the two requirements, either timeliness or throughput. This can lead to an unnecessarily underutilized system or poor guarantees on matching delays. In this paper, we propose an integrated solution that controls both the matching delay and CPU utilization in information dissemination systems to achieve bounded matching delay for high-priority information and maximized system throughput in an example information dissemination system. Our solution is based on optimal control theory for guaranteed control accuracy and system stability. Empirical results on a physical testbed demonstrate that our controllers can guarantee the timeliness requirements while achieving maximized system throughput.

I. INTRODUCTION

During the last decade, information dissemination has started to play a critical role in the design and development of a large class of applications. For example, buyers and sellers need to be matched based on their interests in e-commerce systems, and notified immediately when new business opportunities are identified. Likewise, in security alerting systems, threats detected by various sensors must be reported to the appropriate authorities within a certain time constraint.
In these applications, matches between numerous (e.g., thousands of) sources and numerous sinks should be found accurately, efficiently, and more importantly, in a timely manner. These requirements have been generally described as Valuable Information at the Right Time (VIRT) [1]. This emphasizes that consumers of information should receive accurate information that is of interest to them as soon as it is available or whenever it is requested. At the same time, the maximum possible system throughput should be achieved to find more matched results between data sources and sinks.

Fig. 1. INFOD: an example information dissemination system. (Figure labels: publishers, subscribers, consumers, web service specification, subscriptions, metadata matching, and a registry holding publisher entries and consumer entries.)

INFOD (INFOrmation Dissemination) [2] is an example information dissemination system that aims to support timely delivery of valuable information for a wide range of applications. As shown in Figure 1, information sources and sinks are defined as publishers and consumers, respectively. Subscriptions are prescribed requests for information and are submitted by subscribers on behalf of consumers. Publishers, consumers, and subscribers advertise their attributes and constraints, which are generally referred to as metadata, in a database called the registry. For example, a consumer may have its location as an attribute and have a constraint on desired publishers: they must be located within 10 miles. An example subscription is: all sensors (publishers) send their traffic information to all drivers (consumers) within 10 miles no later than 5 seconds after a jam occurs. Subscriptions may have different priorities. For example, subscriptions for traffic information from police should have a higher priority than those from ordinary drivers. Metadata can be updated periodically and aperiodically.
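To make the registry example concrete, the following is a minimal Python sketch of metadata entries, a distance constraint, and priority-ordered evaluation of subscriptions. It is an illustration only, not INFOD's actual data model or interface: the names `Entry`, `Subscription`, `evaluate`, and `reevaluate_all`, the planar coordinates in miles, and the convention that a lower `priority` value is more urgent are all assumptions introduced here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Registry entry for a publisher or consumer (hypothetical layout)."""
    name: str
    x_miles: float  # assumed planar position, in miles
    y_miles: float


@dataclass(frozen=True)
class Subscription:
    """Subscription with a distance constraint and a priority.

    Assumption: a lower priority value means a more urgent subscription.
    """
    consumer: Entry
    max_distance_miles: float
    priority: int


def distance(a: Entry, b: Entry) -> float:
    """Euclidean distance between two entries, in miles."""
    return math.hypot(a.x_miles - b.x_miles, a.y_miles - b.y_miles)


def evaluate(sub: Subscription, publishers: list[Entry]) -> list[str]:
    """Return the names of publishers that satisfy the distance constraint."""
    return [
        p.name
        for p in publishers
        if distance(sub.consumer, p) <= sub.max_distance_miles
    ]


def reevaluate_all(subs: list[Subscription], publishers: list[Entry]):
    """Evaluate subscriptions in priority order; return (consumer, matches) pairs."""
    ordered = sorted(subs, key=lambda s: s.priority)
    return [(s.consumer.name, evaluate(s, publishers)) for s in ordered]


# Two sensors (publishers) and two subscriptions with a 10-mile constraint.
publishers = [Entry("sensor_a", 0.0, 0.0), Entry("sensor_b", 30.0, 0.0)]
police = Subscription(Entry("police_car", 3.0, 4.0), 10.0, priority=0)
driver = Subscription(Entry("driver_1", 25.0, 0.0), 10.0, priority=1)

results = reevaluate_all([driver, police], publishers)
print(results)
```

In this sketch the police subscription is evaluated before the ordinary driver's because of its priority, and each consumer is matched only with the sensor inside its 10-mile radius. The 5-second delivery deadline from the example subscription is not modeled here.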
INFOD needs to find new matches between the metadata of publishers and consumers, based on the subscriptions, within a time constraint when a metadata update arrives. We refer to this process as subscription reevaluation or metadata matching. Meanwhile, reevaluating as many subscriptions as possible yields more matched results, which means that the registry server should be efficiently utilized. Based on the matched results, publishers are informed where to send filtered information. The information is then sent to the matched consumers without passing through INFOD.

Guaranteeing both bounded matching delay and efficient system utilization in information dissemination systems faces two major challenges. First, when an update arrives, the system may need to reevaluate all related subscriptions to find new matching results between the publishers and consumers. For example, a driver may constantly update its location attribute. As a result, all related subscriptions in the registry need to be continuously reevaluated by rerunning metadata matching to ensure that the driver receives information from