Adaptive Hardware Architecture for
Neural-Network-on-Chip
Kasem Khalil¹,³, Bappaditya Dey²,⁵, Ashok Kumar², Magdy Bayoumi⁴
¹Electrical and Computer Engineering, University of Mississippi, Mississippi, USA
²The Center for Advanced Computer Studies, University of Louisiana at Lafayette, LA, USA
³Department of Electrical Engineering, Assiut University, Egypt
⁴Department of Electrical and Computer Engineering, University of Louisiana at Lafayette, LA, USA
⁵imec (Belgium), Kapeldreef 75, 3001 Leuven, Belgium
Emails: k khalil@aun.edu.eg, bxd9836@louisiana.edu, {ashok.kumar, mab0778}@louisiana.edu
Abstract—Neural networks benefit a wide range of classification and regression applications. A key challenge in implementing a neural network in hardware is making the architecture adaptive enough to fit different applications. Network-on-Chip (NoC) is a power- and bandwidth-efficient method for communication among many nodes. This paper proposes an adaptive neural network based on NoC. The NoC consists of routers and Processing Elements (PEs); each router is connected to a PE that contains m nodes. These nodes are used to construct the layers of a neural network. A configuration packet specifies the number of layers and the number of nodes per layer. The proposed method can assign multiple routers to represent a single layer, based on the required configuration. Thus, the proposed method supports constructing different layers with the desired number of nodes for the target application. The proposed method is implemented on an Altera 10 GX FPGA and achieves an accuracy of 98.18% on the MNIST dataset; it has also been tested on multiple other datasets. The results show that the proposed method has resource utilization comparable to the traditional method.
Keywords: Neural network, NoC, hardware reconfiguration, FPGA architecture.
I. INTRODUCTION
Studying neuroscience to understand the complex circuitry of the human brain has been a long-standing endeavor. With the advent of modern electronics, this quest has prompted numerous researchers and scientists to model neural networks with electrical circuits in different paradigms, such as digital [1, 2], analog [3], and optical. Based on the application domain and learning complexity, architecture variants such as Feed-Forward Deep Networks (FFDN) [4], Convolutional Neural Networks (CNN) [5, 6], and Recurrent Neural Networks (RNN) [7] have been proposed. Fig. 1 shows an artificial neural network architecture with a single input layer, a single output layer, and an arbitrary number (L) of hidden layers. Architecture implementations, in both hardware and software, depend on a heuristic selection of the number of hidden layers and the number of neurons (nodes) per layer, as well as on the complexity of the application domain. Optimizing the neural network architecture has emerged as a new research paradigm in recent years [8].
Fig. 1: Block diagram of neural network architecture.

A neural network generally learns a target function through an iterative process of a "forward pass" and a "backward pass" [9]. In the forward pass, the input data (features) traverse all hidden layers and nodes from the input layer to the output layer. A loss metric is computed from the difference between the predicted value and the ground-truth (target) value. The backward pass adjusts the weights at the corresponding layers and nodes through gradient descent [10], which minimizes the loss metric.
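To make the forward/backward process concrete, the following is a minimal software sketch of this training loop (not the paper's hardware design): a single hidden layer with sigmoid activations, a mean-squared-error loss, and plain full-batch gradient descent. The layer sizes, learning rate, and toy data are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy network: 2 inputs -> 3 hidden nodes -> 1 output (sizes chosen arbitrarily).
W1 = rng.normal(scale=0.5, size=(2, 3)); b1 = np.zeros(3)
W2 = rng.normal(scale=0.5, size=(3, 1)); b2 = np.zeros(1)

# Illustrative XOR-style data (stand-in for real features and targets).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

lr = 1.0
for _ in range(5000):
    # Forward pass: features traverse the hidden layer to the output layer.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    loss = np.mean((out - y) ** 2)      # loss metric: mean squared error

    # Backward pass: propagate the loss gradient and update every weight
    # via gradient descent (chain rule through the sigmoid activations).
    d_out = 2 * (out - y) / len(X) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0)

print(f"final loss: {loss:.4f}")
```

Each iteration performs one forward pass over the whole batch, then one backward pass that moves every weight a small step against the loss gradient; the loss shrinks from its initial value as training proceeds.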
Hardware implementations of Artificial Neural Networks (ANNs) have been realized by balancing a trade-off among multiple parameters, such as hardware complexity, area overhead, throughput, and power dissipation, using analog, digital, or mixed-signal circuits [11–14]. In [15], G.-D. Guglielmo et al. proposed an ASIC-based NN autoencoder model for efficient automated data compression and encoding before transmission. The implementation uses a fixed NN architecture, but flexibility is achieved through programmable weights. The design is based on an LP CMOS 65 nm technology node, with an area of 3.6 mm² and an energy per inference of 2.38 nJ. In [16], H. Irmak et al. proposed an FPGA-based dynamically reconfigurable architecture for energy-efficient NN accelerators. The method uses Dynamic Partial Reconfiguration (DPR) to maximize throughput. The advantage of this approach is that it enables the realization of different neural network types and architectures for different applications by partially updating the FPGA configuration. In Thi Diem Tran
978-1-6654-0279-8/22/$31.00 ©2022 IEEE
2022 IEEE 65th International Midwest Symposium on Circuits and Systems (MWSCAS), DOI: 10.1109/MWSCAS54063.2022.9859323