Integrating Task Allocation, Planning, Scheduling, and Adaptive Resource Management to Support Autonomy in a Global Sensor Web

John S. Kinnebrew, Gautam Biswas, Nishanth Shankaran, and Douglas C. Schmidt
EECS Department & ISIS, Vanderbilt University, Nashville, TN 37203, USA
john.s.kinnebrew@vanderbilt.edu

Dipa Suri
Lockheed-Martin Space Systems Company, Advanced Technology Center
3251 Hanover Street, Palo Alto, CA 94304
dipa.suri@lmco.com

Abstract

NASA’s Earth Science Vision calls for a global sensor web comprised of heterogeneous platforms with on-board information processing, capable of orchestrating real-time collaborative operations with other platforms and ground stations. Such a global sensor web will be a system of systems, including many distributed real-time embedded (DRE) systems, such as multi-satellite formations. Individual systems of the sensor web must collect and analyze large quantities of data via sequences of heterogeneous data collection, manipulation, and coordination tasks to meet specified goals for earth science applications. In large DRE systems, such as those composing a global sensor web, the sheer number of available components often poses a combinatorial planning problem for identifying component sequences to achieve specified goals. Moreover, the dynamic nature of these systems requires runtime management and modification of deployed components.
We present the design of the Multi-agent Architecture for Coordinated Responsive Observations, which includes two novel services contributing to the design and deployment of autonomous, predictable, and high-performance DRE systems that operate in dynamic and uncertain environments: (i) the Spreading Activation Partial Order Planner (SA-POP), which performs decision-theoretic planning and scheduling using a spreading activation network to capture the probabilistic functional relationships between tasks (implemented as components) and goals; and (ii) the Resource Allocation and Control Engine (RACE), which is an open-source adaptive resource management framework built atop standards-based QoS-enabled component middleware. We illustrate the effectiveness of our approach in the face of changing operational conditions, workloads, and resource availability, in the context of salient Earth science missions.

1. Introduction

Remote sensing missions for Earth Science provide a wealth of information to help scientists understand the dynamics of our planet. Conventional approaches use a stove-pipe operational model, in which each spacecraft or in situ sensor cluster is commanded by, and transmits data to, large dedicated ground operations centers. These approaches, however, introduce untenable latencies in developing data products, which hinder model building and refinement. Moreover, the inherent communication lag and potentially limited bandwidth necessitate autonomous planning and resource allocation at a local level to effectively achieve goals under rapidly evolving environmental and system conditions. To address these limitations, NASA’s Earth Science Vision calls for a global sensor web comprised of heterogeneous platforms with on-board information processing, capable of orchestrating real-time collaborative operations with other platforms and ground stations [7].
Many of the platforms comprising the sensor web will be distributed real-time embedded (DRE) systems, such as spacecraft and airborne systems. Modern DRE systems implement task sequences, such as data processing, using component middleware [6], which automates remoting, lifecycle management, system resource management, deployment, and configuration. However, in large-scale DRE systems, the sheer number of component sequences often poses a combinatorial deployment problem, i.e., mapping components to computing nodes [14]. Moreover, the dynamic nature of the operations requires runtime management and modification of deployments [5]. At the level of individual platforms (e.g., individual satellites and ground-based sensor installations), these problems necessitate a system with the ability to make resource allocation and control decisions at runtime. More effective solutions to this problem pro-