Toward Improving the Robustness of Deep Learning Models via
Model Transformation
Yingyi Zhang
State Key Laboratory of
Communication Content Cognition,
People’s Daily Online, Beijing, China
100733; College of Intelligence and
Computing, Tianjin University
yingyizhang@tju.edu.cn
Zan Wang
State Key Laboratory of
Communication Content Cognition,
People’s Daily Online, Beijing, China
100733; College of Intelligence and
Computing, Tianjin University
wangzan@tju.edu.cn
Jiajun Jiang∗
College of Intelligence and
Computing, Tianjin University
Tianjin, China
jiangjiajun@tju.edu.cn
Hanmo You
College of Intelligence and
Computing, Tianjin University
Tianjin, China
youhanmo@tju.edu.cn
Junjie Chen
College of Intelligence and
Computing, Tianjin University
Tianjin, China
junjiechen@tju.edu.cn
ABSTRACT
Deep learning (DL) techniques have attracted much attention in
recent years, and have been applied to many application scenarios,
including those that are safety-critical. Improving the universal
robustness of DL models is therefore vital, and many approaches
have been proposed toward this goal in recent decades. Among
existing approaches, adversarial training is the most representative:
it tunes a trained model further by incorporating adversarial
samples. Although successful, such approaches still suffer from
limited generalizability, performing unsatisfactorily in the face of
diverse attacks. Targeting this problem, in this paper we propose a
novel model training framework that aims to improve the universal
robustness of DL models via model transformation, incorporated
with a data augmentation strategy in a delta debugging fashion. We
have implemented our approach in a tool, called Dare, and
conducted an extensive evaluation on 9 DL models. The results show
that our approach significantly outperforms existing adversarial
training techniques. Specifically, Dare achieved the highest
Empirical Robustness in 29 of 45 testing scenarios under various
attacks, while the best baseline approach did so in only 5 of 45.
CCS CONCEPTS
· Computing methodologies → Neural networks; · Software
and its engineering → Software testing and debugging.
∗Jiajun Jiang is the corresponding author.
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior specific permission and/or a
fee. Request permissions from permissions@acm.org.
ASE ’22, October 10–14, 2022, Rochester, MI, USA
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9475-8/22/10. . . $15.00
https://doi.org/10.1145/3551349.3556920
KEYWORDS
Deep Neural Network, Delta Debugging, Model Robustness
ACM Reference Format:
Yingyi Zhang, Zan Wang, Jiajun Jiang, Hanmo You, and Junjie Chen. 2022.
Toward Improving the Robustness of Deep Learning Models via Model
Transformation. In 37th IEEE/ACM International Conference on Automated
Software Engineering (ASE ’22), October 10–14, 2022, Rochester, MI, USA. ACM,
New York, NY, USA, 13 pages. https://doi.org/10.1145/3551349.3556920
1 INTRODUCTION
In recent years, deep learning (DL) techniques have attracted much
attention from researchers, and have been prevalently used in both
industrial practice and academic research, such as image process-
ing [83, 85], machine translation [32, 49] and software engineer-
ing [5, 42, 65, 76], etc. Particularly, some application scenarios are
safety-critical, such as autonomous driving [4, 43, 84, 92] and
aircraft collision avoidance [31]. However, as reported by existing
studies [9, 58, 63], DL models in practice are fragile when facing
perturbations and are thus vulnerable to attacks. For example,
researchers from Tencent Keen Security Lab successfully tricked
the lane detection system of Tesla Model S with three small adver-
sarial sticker images, making it swerve into the wrong lane without
any warnings or precautions [1]. Therefore, it is vital to ensure the
safety and enhance the adversarial robustness of DL models in the
face of potential adversarial attacks.
Unlike traditional handcrafted programs, which are deterministic
with fixed code logic defined by a set of executable machine
instructions, deep learning models are built from a set of input
examples. That is, given a set of training examples, a model with a
set of parameters is learned according to a predefined neural
network structure, and is expected to meet the functionality
requirement, such as image classification. However, since the
number of input examples is limited while the complete input space
is usually enormous or even infinite in practice (also known as the
incomplete specification issue in traditional software engineering
tasks such as programming by examples [14–18, 24, 33, 35]), the
learned model may not work well on unseen inputs, especially
samples decorated with crafted attacking features.
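To make this concrete, the following minimal sketch (a hypothetical toy setup, not the Dare framework proposed in this paper) learns the parameters of a predefined model structure from a finite set of training examples, then shows how a small crafted perturbation, an FGSM-style step against the learned weights, can flip the prediction on an input even though the model fits its training examples well:

```python
import numpy as np

# Toy setup (assumed for illustration): a predefined model structure
# (logistic regression) whose parameters w, b are learned from a
# finite set of labeled examples.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # ground-truth labeling rule

w = np.zeros(2)
b = 0.0
lr = 0.5
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid activation
    w -= lr * (X.T @ (p - y)) / len(y)      # cross-entropy gradient step
    b -= lr * np.mean(p - y)

def predict(x):
    return (x @ w + b > 0).astype(float)

train_acc = np.mean(predict(X) == y)  # high on the examples seen

# A crafted perturbation: step against the sign of the learned
# weights (FGSM-style) to push a correctly classified input across
# the decision boundary.
x0 = np.array([0.5, 0.5])        # true label 1, correctly classified
eps = 1.2
x_adv = x0 - eps * np.sign(w)    # adversarially perturbed input
```

The model generalizes over its finite training set, yet the perturbed `x_adv` exploits the gap between the examples seen and the full input space, which is exactly the robustness gap that adversarial training and this paper's approach target.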