Poster Abstract: ECSRL: A Learning-Based Scheduling
Framework for AI Workloads in Heterogeneous Edge-Cloud
Systems
Changyao Lin
Harbin Institute of Technology
Harbin, China
20S003095@stu.hit.edu.cn
Ziyang Zhang
Harbin Institute of Technology
Harbin, China
20B903026@stu.hit.edu.cn
Huan Li
Harbin Institute of Technology (Shenzhen)
Shenzhen, China
huanli@hit.edu.cn
Jie Liu
Harbin Institute of Technology (Shenzhen)
Shenzhen, China
jieliu@hit.edu.cn
ABSTRACT
Recent advances in both lightweight models and edge computing
make it possible for inference tasks to be executed concurrently on
resource-constrained edge devices. However, our preliminary experiments show that executing different lightweight models concurrently on an edge device may lead to performance degradation. In this paper, we propose a learning-based scheduling framework, ECSRL, to optimize latency and power consumption for inference tasks running in heterogeneous Edge-Cloud systems.
CCS CONCEPTS
· Computing methodologies → Planning and scheduling.
KEYWORDS
Heterogeneous Edge Computing; Task Scheduling; Reinforcement
Learning
ACM Reference Format:
Changyao Lin, Ziyang Zhang, Huan Li, and Jie Liu. 2021. Poster Abstract:
ECSRL: A Learning-Based Scheduling Framework for AI Workloads in
Heterogeneous Edge-Cloud Systems. In Proceedings of The 19th ACM In-
ternational Conference on Embedded Networked Sensor Systems (SenSys),
Nov 15-17, 2021, Coimbra, Portugal. ACM, New York, NY, USA, 2 pages.
https://doi.org/10.1145/3485730.3492886
1 INTRODUCTION
As a new computing paradigm, edge AI [1] and mobile edge computing (MEC) [2] sink computing power from the cloud to edge AI devices, making it possible to run Deep Learning (DL) inference in real time at the edge. Compared with cloud computing clusters, edge devices offer lower latency, lower power consumption, lower price, and easier deployment. The advantages and disadvantages of both are summarized in Table 1 below.
Permission to make digital or hard copies of part or all of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for third-party components of this work must be honored.
For all other uses, contact the owner/author(s).
SenSys ’21, November 15–17, 2021, Coimbra, Portugal
© 2021 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9097-2/21/11. . . $15.00
https://doi.org/10.1145/3485730.3492886
Table 1: Edge Device versus Cloud Server
Paradigm            | Delay | Accuracy | Power | Price
--------------------|-------|----------|-------|------
Edge + Light Model  | Low   | Low      | Low   | Low
Cloud + Large Model | High  | High     | High  | High
In order to understand the performance when multiple DL-based workloads run concurrently on an edge device, we set up a testbed and found that the completion time of a newly arrived AI task is affected under concurrent execution. Moreover, as noted in prior work, most existing cluster scheduling algorithms are based either on simulation or on homogeneous virtual machine clusters [3]. It is therefore essential to rethink and design efficient task scheduling algorithms for edge clusters with heterogeneous devices.
In this paper, we propose a reinforcement learning strategy to design a task scheduling framework for heterogeneous Edge-Cloud systems. The objective is to minimize power consumption and completion time. Meanwhile, if the lightweight DL model deployed on the edge device cannot meet the accuracy requirement, the system automatically offloads the task to the cloud to run a high-precision model for a better inference result.
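The scheduling logic described above can be sketched as a tabular Q-learning loop. This is a minimal illustrative sketch only: the state encoding, action set, reward weights, and the hard accuracy penalty below are our assumptions for exposition, not the actual ECSRL design.

```python
import random
from collections import defaultdict

# Assumed action set: run the light model on the edge, or offload to the cloud.
ACTIONS = ["edge_light", "cloud_large"]

class QScheduler:
    """Minimal tabular Q-learning scheduler (illustrative sketch)."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def choose(self, state):
        # Epsilon-greedy action selection over the two placement options.
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

def reward(latency_s, power_w, accuracy, accuracy_req, w_lat=1.0, w_pow=0.1):
    """Reward jointly penalizing completion time and power (weights are
    hypothetical); an accuracy shortfall is penalized heavily so the policy
    learns to offload such tasks to the high-precision cloud model."""
    if accuracy < accuracy_req:
        return -100.0
    return -(w_lat * latency_s + w_pow * power_w)
```

After repeated episodes, the Q-values steer tasks toward the placement with the better latency/power trade-off, while the accuracy penalty drives offloading whenever the lightweight model is insufficient.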
2 EXPERIMENTAL STUDY
To measure the performance of concurrent workload execution on edge devices, our testbed deploys two lightweight models, ResNet-18 and SSD-MobileNet-v2, on an NVIDIA Xavier NX.
Figure 1 shows the performance matrices for ResNet-18 and SSD-MobileNet-v2 on the edge device NX, respectively. Here, m denotes the number of SSD-MobileNet-v2 instances and r denotes the number of ResNet-18 instances. Each matrix element is the ratio of the running time of the newly added workload, executed concurrently with the existing workloads, to the time when the system executes only that single workload. For example, in Figure 1(a), the value at [1][0] is the ratio of the inference time when ResNet-18 is deployed on an NX already running 1 SSD-MobileNet-v2 and 0 ResNet-18 instances, over the inference time when ResNet-18 runs on the NX alone. Since the time to complete a single inference task on a specific device is a fixed value, larger values in this figure indicate performance degradation (longer completion time) as more tasks are added to the system.
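The slowdown ratios in such a matrix can be computed directly from the solo and concurrent timing measurements. The helper below is a sketch of that computation; the times in the test are placeholders for illustration, not our measured values.

```python
def slowdown_matrix(concurrent_times, solo_time):
    """Convert measured completion times into slowdown ratios.

    concurrent_times[m][r] is the completion time of the newly added
    workload when m SSD-MobileNet-v2 and r ResNet-18 instances are
    already running; solo_time is its completion time when run alone.
    A ratio greater than 1.0 indicates performance degradation.
    """
    if solo_time <= 0:
        raise ValueError("solo_time must be positive")
    return [[t / solo_time for t in row] for row in concurrent_times]
```

Each entry is dimensionless, so matrices measured on different devices or models can be compared directly.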