Dynamic Resource Management Across Cloud-Edge Resources for Performance-Sensitive Applications

Shashank Shekhar
Vanderbilt University, Nashville, TN 37235, USA
Email: shashank.shekhar@vanderbilt.edu

Aniruddha Gokhale (Adviser)
Vanderbilt University, Nashville, TN 37235, USA
Email: agokhale@vanderbilt.edu

Abstract—A large number of modern applications and systems are cloud-hosted; however, limitations in performance assurances from the cloud and the longer, often unpredictable end-to-end network latencies between the end user and the cloud can be detrimental to the response time requirements of the applications, specifically those that have stringent Quality of Service (QoS) requirements. Although edge resources, such as cloudlets, may alleviate some of the latency concerns, there is a general lack of mechanisms that can dynamically manage resources across the cloud-edge spectrum. To address these gaps, this research proposes Dynamic Data Driven Cloud and Edge Systems (D³CES). It uses measurement data collected from adaptively instrumenting the cloud and edge resources to learn and enhance models of the distributed resource pool. In turn, the framework uses the learned models in a feedback loop to make effective resource management decisions to host applications and deliver their QoS properties. D³CES is being evaluated in the context of a variety of cyber physical systems, such as smart city, online games, and augmented reality applications.

Keywords-Cloud Computing, Edge Computing, Fog Computing, Resource Management, DDDAS, CPS, IoT.

I. INTRODUCTION

The elastic properties and cost benefits of the cloud have made it an attractive hosting platform for a variety of soft real-time cyber physical systems (CPS)/Internet of Things (IoT) applications, such as cognitive assistance, patient health monitoring, and industrial automation.
The stringent quality of service (QoS) considerations of these applications mandate both predictable performance from the cloud and lower end-to-end network latencies between the end user and the cloud. To date, security and performance assurance continue to be hard problems to resolve in cloud platforms due to their virtualized and multi-tenant nature [12]. Although recent advances in fog and edge computing have enabled cloud resources to move closer to the CPS/IoT devices, thereby mitigating the network latency concerns to some extent [3], there is still a general lack of mechanisms that can dynamically manage resources across the cloud-edge spectrum. This is a hard problem to resolve due to the highly dynamic behaviors of the edge and cloud. Consequently, any pre-defined and fixed set of resource management policies will be rendered useless for hosting CPS/IoT applications in the cloud.

The dynamic data driven application systems (DDDAS) paradigm [10] addresses precisely these challenges. DDDAS prescribes an approach where applications are instrumented adaptively so that their models can be learned and enhanced continuously, and in turn these models can be analyzed and used in a feedback loop to steer the applications along their intended trajectories. Previous work has either focused on a specific application or applied DDDAS for resilience and security [2]. We propose to apply the DDDAS principle to the pool of resources spanning the cloud-edge spectrum to enable and enforce dynamic resource management decisions that deliver the required QoS properties of cloud-hosted applications. To that end, we propose Dynamic Data Driven Cloud and Edge Systems (D³CES), which uses performance data collected from adaptively instrumenting the cloud and edge resources to learn and enhance models of the distributed resource pool, and in turn uses these models in a feedback loop to make effective resource management decisions to host CPS applications and deliver their QoS properties.
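The measure, learn, and decide cycle described above can be illustrated with a minimal sketch. It is not the D³CES implementation: the `SiteModel` class, the exponentially weighted moving average used as the "learned model", the smoothing factor, and the site names are all hypothetical choices made only to show the shape of a data-driven feedback loop.

```python
from dataclasses import dataclass


@dataclass
class SiteModel:
    """Hypothetical latency model for one hosting site (edge or cloud)."""
    name: str
    predicted_ms: float   # current latency estimate for this site
    alpha: float = 0.5    # assumed smoothing factor for the moving average

    def update(self, observed_ms: float) -> None:
        # Learn: blend the newest measurement into the running estimate.
        self.predicted_ms = (
            self.alpha * observed_ms + (1 - self.alpha) * self.predicted_ms
        )


def choose_site(models, qos_ms):
    """Return the name of the site with the lowest predicted latency that
    meets the QoS bound, or None when no site is predicted to meet it."""
    feasible = [m for m in models if m.predicted_ms <= qos_ms]
    if not feasible:
        return None
    return min(feasible, key=lambda m: m.predicted_ms).name


def control_loop(models, measurements, qos_ms):
    """Run one decision per measurement round.

    Each round is a dict mapping site name to the observed latency in ms.
    """
    decisions = []
    for sample in measurements:
        for m in models:
            m.update(sample[m.name])                   # measure and learn
        decisions.append(choose_site(models, qos_ms))  # decide
    return decisions
```

In this sketch, a sustained rise in the latency observed at one site gradually raises its estimate until the loop redirects placement to another site that is still predicted to meet the bound. A real system would replace the moving average with richer performance models and would also account for migration cost.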
The rest of this paper is organized as follows. Section II discusses the challenges faced in realizing D³CES, Section III discusses the current state of the art, Section IV proposes a set of solutions that form the contours of this doctoral research, and Section V provides concluding remarks.

II. KEY RESEARCH CHALLENGES AND SOLUTION NEEDS

Our research calls for an effective use of resources across the cloud data centers (CDCs) and the micro data centers (MDCs) that reside at the edge. The following subsections list a non-exhaustive set of challenges along three dimensions that we are addressing in this doctoral research.

A. Application-imposed Challenges

1) Workload variations: The workload generated by CPS/IoT applications may exhibit both transient and sustained variability, which needs to be predicted and addressed.

2) Stochastic execution semantics: For some CPS/IoT applications, their uncertain and dynamic nature may require several instances of the same task to be executed to reach specified confidence levels. Executions may differ in execution time, yet each imposes certain QoS needs.

3) Application structure: Increasingly, cloud-based applications are realized as a collection of communicating microservices, which can be deployed independently