Information Fusion on Concurrent Computing Systems

Daniel M. Zimmerman and K. Mani Chandy
Computer Science 256-80
California Institute of Technology
Pasadena, California 91125 USA
dmz@cs.caltech.edu, mani@cs.caltech.edu

Abstract

Detecting and responding to critical states in the environment has become increasingly important in dealing with man-made and natural crises. The state of the extended environment includes data from geographically dispersed sites in various forms: numerical, textual, and visual. This data comes from multiple sources, such as databases, news sources, and Web services, and has varying degrees of accuracy. For some applications, responses within seconds are critical. Analyzing events with high occurrence rates and heterogeneous data in seconds requires high-performance systems. This paper explores algorithms for scheduling processors in shared-memory multiprocessors and in distributed computing grids to reduce response times for critical state detection. The algorithms explored share a common characteristic: events arriving on different streams are assumed to have (reasonably accurate) timestamps, and a collection of events with approximately the same timestamp is treated as a global snapshot, a valid state, of the environment. The algorithms use concepts from distributed simulation, such as null messages, to improve their efficiency. Quantitative performance results are presented.

1. Overview

The problem of dealing with rapidly evolving crises has received increasing attention in the scientific literature and the news over the last few years. The problem has three main parts: (1) specifying situations that require responses and specifying the responses; (2) detecting these situations; and (3) responding appropriately. This paper deals with using parallel computing systems to improve the performance of the second part: detecting critical situations.
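The snapshot assembly described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `Event` structure, the stream names, and the `tolerance` parameter are assumptions introduced for the example.

```python
from dataclasses import dataclass

@dataclass
class Event:
    stream: str       # which input stream the event arrived on
    timestamp: float  # (reasonably accurate) time of the event
    value: object     # the event payload

def assemble_snapshots(events, streams, tolerance):
    """Group events whose timestamps agree to within `tolerance`
    into global snapshots, each holding one event per stream."""
    snapshots = []
    pending = {}  # stream name -> most recent event seen on that stream
    for ev in sorted(events, key=lambda e: e.timestamp):
        pending[ev.stream] = ev
        # A snapshot is valid once every stream has reported and all
        # pending timestamps fall within the tolerance window.
        if len(pending) == len(streams):
            times = [e.timestamp for e in pending.values()]
            if max(times) - min(times) <= tolerance:
                snapshots.append(dict(pending))
                pending.clear()
    return snapshots

# Hypothetical streams: two events close in time form one snapshot;
# the later pair is too far apart and is discarded.
events = [Event("seismic", 0.00, 3.1), Event("radiation", 0.05, 0.2),
          Event("seismic", 1.00, 3.4), Event("radiation", 1.40, 0.3)]
snaps = assemble_snapshots(events, {"seismic", "radiation"}, 0.1)
```

A real system would process unbounded streams incrementally rather than sorting a finite batch, but the grouping criterion, approximately equal timestamps across all streams, is the same.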
Consider the following example of interest to the Department of Homeland Security. The problem is to detect bio-terrorist attacks and initial signs of epidemics by correlating multiple streams of possibly low-confidence data from sensors, local and national public health information networks, and cues from indicators such as news and government sources indicating the geographical locations, tactics, and timing of possible attacks. The detection of a possibly dangerous situation results in a response; the response may be further analysis or an action such as alerting public health officials.

Similar problems arise in the detection of many other threats, such as radiation sources, earthquakes, and tsunamis. For instance, data from multiple sensors in the earth can be correlated to give a few seconds of warning (enough to stop elevators at floors) before the shock wave of an earthquake hits. This paper explores algorithms for scheduling processors in shared-memory multiprocessors and in distributed computing grids to reduce response times for this type of correlation.

The remainder of the paper is structured as follows. First, a detection model based on graphs of processing nodes is described, and design issues related to the implementation of this model are discussed. Next, the scheduling algorithms themselves, the benefits and drawbacks of each, and the simulation environment developed to test them are described. Finally, the simulation results and conclusions are presented, along with a brief discussion of related work.

2. Graph Model of Detection

2.1. Model-Based Detection

A critical-state detection system, or detection system for short, fuses (correlates, integrates, and analyzes) data that it obtains from multiple sources, and then sends a message invoking an appropriate response if a response is merited.
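As a minimal sketch of this fuse-then-respond loop (the model definition, threshold, and response callback below are hypothetical, introduced only to make the control flow concrete):

```python
def detect_and_respond(snapshot, models, respond):
    """Compare fused measurement data against a set of critical-state
    models; invoke the response callback for each model that matches."""
    for model in models:
        if model["matches"](snapshot):
            respond(model["name"], snapshot)

# Hypothetical model: a high seismic reading in the fused snapshot
# indicates a possible earthquake and merits a response.
quake_model = {
    "name": "earthquake-warning",
    "matches": lambda snap: snap.get("seismic", 0.0) > 3.0,
}

alerts = []
detect_and_respond({"seismic": 3.4, "radiation": 0.2},
                   [quake_model],
                   lambda name, snap: alerts.append(name))
```

Here the response is simply recorded; in a deployed system it would be a message to another processing node or to an external actor such as a public health network.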
A situation that merits a response can be defined as one in which measurement data fits a model of a critical situation, or as an anomaly (anomalies are discussed later). A correlation system has a set of models, some of which represent