Research Article
Optimizing the Deep Neural Networks by Layer-Wise Refined
Pruning and the Acceleration on FPGA
Hengyi Li,1 Xuebin Yue,1 Zhichen Wang,1 Zhilei Chai,2 Wenwen Wang,3 Hiroyuki Tomiyama,1 and Lin Meng1

1Department of Electronic and Computer Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan
2School of AI and Computer Science, Jiangnan University, Wuxi, China
3Department of Computer Science, University of Georgia, Athens, GA, USA
Correspondence should be addressed to Lin Meng; menglin@fc.ritsumei.ac.jp
Received 15 February 2022; Revised 8 March 2022; Accepted 22 March 2022; Published 1 June 2022
Academic Editor: M. Hassaballah
Copyright © 2022 Hengyi Li et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
To accelerate the practical applications of artificial intelligence, this paper proposes a highly efficient layer-wise refined pruning method for deep neural networks at the software level and accelerates the inference process at the hardware level on a field-programmable gate array (FPGA). The refined pruning operation is based on the channel-wise importance indexes of each layer and the layer-wise input sparsity of the convolutional layers. The method exploits the characteristics of the native networks without introducing any extra workload into the training phase. In addition, the operation is easily extended to various state-of-the-art deep neural networks. The effectiveness of the method is verified on ResNet and VGG architectures with the CIFAR10, CIFAR100, and ImageNet100 datasets. Experimental results show that for ResNet50 on CIFAR10 and ResNet101 on CIFAR100, more than 85% of parameters and floating-point operations are pruned with only 0.35% and 0.40% accuracy loss, respectively. As for the VGG network, 87.05% of parameters and 75.78% of floating-point operations are pruned with only 0.74% accuracy loss for VGG13BN on CIFAR10. Furthermore, we accelerate the networks at the hardware level on the FPGA platform using the tool Vitis AI. In two-thread mode on the FPGA, the throughput of the pruned VGG13BN and ResNet101 reaches 151.99 fps and 124.31 fps, respectively, and the pruned networks achieve about 4.3× and 1.8× speedup for VGG13BN and ResNet101, respectively, compared with the original networks on the FPGA.
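The paper's precise channel-wise importance index is defined later in the text; a common choice in channel pruning is the L1 norm of each output channel's filter weights. The following minimal sketch, assuming that L1-norm criterion and a hypothetical `prune_ratio` budget, illustrates how channels of a convolutional layer could be ranked and selected for removal.

```python
import numpy as np

def channel_importance(weights):
    """Importance index per output channel.

    weights: array of shape (out_channels, in_channels, kH, kW).
    Here we use the L1 norm of each filter as a simple importance
    index (an assumption; the paper defines its own index).
    """
    return np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)

def select_channels(weights, prune_ratio):
    """Return the indices of channels to keep after pruning.

    The prune_ratio least-important channels are dropped.
    """
    scores = channel_importance(weights)
    n_prune = int(len(scores) * prune_ratio)
    order = np.argsort(scores)          # ascending: least important first
    pruned = set(order[:n_prune].tolist())
    return [c for c in range(len(scores)) if c not in pruned]

# Toy example: a conv layer with 8 output channels, 3x3 kernels.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 3, 3, 3))
keep = select_channels(w, prune_ratio=0.5)
print(len(keep))  # 4 channels kept
```

In a real pipeline, the kept indices would then be used to slice the layer's weight tensor and the next layer's input channels, shrinking both parameters and floating-point operations.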
1. Introduction
Convolutional neural networks (CNNs) have brought significant revolutionary progress to almost all sciences, enabled by the availability of massive data and high-performance hardware, since the milestone architecture AlexNet was proposed in 2012 [1]. Deep neural networks (DNNs) have been applied to multiple tasks such as target classification, detection, and recognition. Various fields, including computer vision, natural language processing [2], cultural heritage reorganization and protection [3], environment monitoring [4], the Internet of Things [5], and service ecosystems [6], have made scientific breakthroughs with the support of the technique. However, as DNNs achieve great performance improvements such as high accuracy through deeper and larger architectures, problems follow (the code related to the experiments of this study is available at: https://github.com/lihengyi-ai/2022CIaN).
The deeper architecture and larger size of DNNs consume overwhelming computing resources with a large amount of redundancy [7]. The high intensity of computation and memory access places a heavy burden on deep neural networks, which is a huge obstacle blocking high-efficiency artificial intelligence inference, especially on resource-constrained hardware platforms. It has therefore become greatly important to compress and accelerate DNNs at both the software level and the hardware level, especially for hardware-resource- and energy-limited terminal devices. Various studies have been conducted to optimize DNNs and have achieved significant
Hindawi
Computational Intelligence and Neuroscience
Volume 2022, Article ID 8039281, 22 pages
https://doi.org/10.1155/2022/8039281