Adaptive Hardware Architecture for Neural-Network-on-Chip

Kasem Khalil 1,3, Bappaditya Dey 2,5, Ashok Kumar 2, Magdy Bayoumi 4
1 Electrical and Computer Engineering, University of Mississippi, Mississippi, USA
2 The Center for Advanced Computer Studies, University of Louisiana at Lafayette, LA, USA
3 Department of Electrical Engineering, Assiut University, Egypt
4 Department of Electrical and Computer Engineering, University of Louisiana at Lafayette, LA, USA
5 imec (Belgium), Kapeldreef 75, 3001 Leuven, Belgium
Emails: k khalil@aun.edu.eg, bxd9836@louisiana.edu, {ashok.kumar, mab0778}@louisiana.edu

Abstract—Neural networks benefit a wide range of classification and regression applications. Implementing a neural network in hardware is challenging, particularly in making the architecture adaptive enough to fit different applications. Network-on-Chip (NoC) is a power- and bandwidth-efficient approach to communication among many nodes. This paper proposes an adaptive neural network based on NoC. The NoC consists of routers and Processing Elements (PEs); each router is connected to a PE that contains m nodes. These nodes are used to construct the layers of a neural network. A configuration packet specifies the number of layers and the number of nodes per layer. The proposed method can assign multiple routers to represent a single layer, based on the required configuration. Thus, the method supports constructing layers with any desired number of nodes, as appropriate for the target application. The proposed method is implemented on an Altera 10 GX FPGA and achieves an accuracy of 98.18% on the MNIST dataset; it has also been tested on multiple other datasets. The results show that the proposed method's resource utilization is comparable to that of the traditional method.

Keywords: Neural network, NoC, hardware reconfiguration, FPGA architecture.
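The layer-to-router assignment described in the abstract can be illustrated with a small sketch. Everything here (the function name, and representing the configuration packet as a plain list of layer sizes) is a hypothetical illustration of the idea, not the paper's implementation: a layer that needs more than m nodes simply spans multiple routers.

```python
import math

def assign_routers(layer_sizes, m, num_routers):
    """Map each requested layer onto NoC routers, where each router's
    PE provides m neuron nodes. Illustrative sketch only; names and
    the packet format are assumptions, not the paper's design."""
    mapping = []
    next_router = 0
    for nodes in layer_sizes:
        needed = math.ceil(nodes / m)  # routers required for this layer
        if next_router + needed > num_routers:
            raise ValueError("not enough routers for requested configuration")
        mapping.append(list(range(next_router, next_router + needed)))
        next_router += needed
    return mapping

# Example: 3-layer network (784-16-10) with m = 64 nodes per PE;
# the first layer spans ceil(784/64) = 13 routers.
print(assign_routers([784, 16, 10], m=64, num_routers=20))
```

Under this view, reconfiguring the network for a new application amounts to sending a new configuration packet rather than resynthesizing the hardware.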
I. INTRODUCTION

Studying neuroscience to understand the complex circuitry of the human brain has been a long-standing endeavor. With the advent of modern electronics, this quest has inspired numerous researchers and scientists to model neural networks with electrical circuits in different paradigms such as digital [1, 2], analog [3], and optical. Depending on the application domain and learning complexity, several architecture variants have been proposed, such as Feed-Forward Deep Networks (FFDN) [4], Convolutional Neural Networks (CNN) [5, 6], and Recurrent Neural Networks (RNN) [7]. Fig. 1 shows an artificial neural network architecture with a single input layer, a single output layer, and an arbitrary number (L) of hidden layers. Architecture implementations, both in hardware and software, depend on a heuristic selection of the number of hidden layers and the number of neurons (nodes) per layer, as well as on the complexity of the application domain. Optimizing the neural network architecture accordingly has emerged as a new research paradigm in recent years [8].

Fig. 1: Block diagram of neural network architecture.

A neural network generally learns a target function through an iterative process of a "forward pass" and a "backward pass" [9]. In the forward pass, input data (features) traverse all hidden layers and their nodes from the input layer to the output layer. A loss metric is computed as the difference between the predicted value and the ground-truth (target) value. The backward pass then adjusts the weights at the corresponding layers and nodes through gradient descent [10], driven by minimization of the loss metric.

Hardware implementations of Artificial Neural Networks (ANNs) have been realized while balancing a trade-off among multiple parameters, such as hardware complexity, area overhead, throughput, and power dissipation, using analog, digital, or mixed-signal circuits [11–14]. In [15], G.-D. Guglielmo et al.
proposed an ASIC-based NN autoencoder model for automated, efficient data compression and encoding before transmission. The implementation is based on a fixed NN architecture, but flexibility can be achieved through programmable weights. The architecture is implemented in an LP CMOS 65 nm technology node, with an area of 3.6 mm² and an energy per inference of 2.38 nJ. H. Irmak et al. [16] proposed an FPGA-based dynamically reconfigurable architecture for energy-efficient NN accelerators. Their method uses a Dynamic Partial Reconfiguration (DPR) technique and maximizes throughput. The advantage of this approach is that it enables the realization of different neural network types and architectures for different applications through partial reconfiguration of the FPGA. In Thi Diem Tran

2022 IEEE 65th International Midwest Symposium on Circuits and Systems (MWSCAS) | 978-1-6654-0279-8/22/$31.00 ©2022 IEEE | DOI: 10.1109/MWSCAS54063.2022.9859323
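The forward-pass/backward-pass training process described in the introduction can be sketched minimally with a single sigmoid neuron. This toy example (all names are illustrative and unrelated to the paper's hardware) shows the three steps in order: prediction, loss computation, and the gradient-descent weight update.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(w, b, x, target, lr=0.5):
    # Forward pass: the input traverses the node to produce a prediction.
    y = sigmoid(w * x + b)
    # Loss metric: squared difference between prediction and ground truth.
    loss = 0.5 * (y - target) ** 2
    # Backward pass: gradient descent on weight and bias.
    grad = (y - target) * y * (1.0 - y)  # dLoss/dz via the chain rule
    w -= lr * grad * x
    b -= lr * grad
    return w, b, loss

w, b = 0.1, 0.0
for _ in range(2000):
    w, b, loss = train_step(w, b, x=1.0, target=1.0)
# After repeated forward/backward iterations the prediction
# approaches the target and the loss shrinks.
```

In a full network the same gradient computation is propagated layer by layer, which is what makes the per-layer node counts discussed above matter for hardware cost.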