An Efficient Approach for Neural Network Architecture
Kasem Khalil, Omar Eldash, Ashok Kumar, Magdy Bayoumi
The Center for Advanced Computer Studies
University of Louisiana at Lafayette, Louisiana, USA
Emails: kmk8148, oke1206, axk1769, mab0778@louisiana.edu
Abstract—Neural networks are among the main concepts used in machine learning applications. The hardware realization of a neural network requires a large area to implement a network with many hidden layers. This paper presents a novel neural network design that reduces the hardware area. The proposed approach reduces the number of physical hidden layers from N to N/2 while maintaining full accuracy with a minimal increase in time complexity. It adopts the concept of multiplexing the input and output layers of the neural network. The approach is implemented using the TensorFlow framework and a Xilinx Virtex-7 FPGA. The simulation results show that the accuracy of the proposed approach matches that of a traditional network using N layers, while using only N/2 hardware layers. The hardware implementation results show that the proposed approach saves 42% of the area.
Keywords: Neural Network, Deep Learning, Multiplexing, Image
Recognition, Pattern Recognition, Convolutional Neural Network.
I. INTRODUCTION
Neural Networks (NNs) are widely used as classifiers in data classification applications and are becoming pervasive in fields such as speech recognition [1], computer vision, image recognition, natural language processing, and decision making [2]. Classification techniques involve predicting a certain output from a given input. Many neural network models have been proposed to learn associations within data sets and then make predictions on new data. After training, the algorithm should capture the appropriate relationships between the attributes in order to precisely predict the actual output for new inputs. The accuracy of prediction reflects how effectively the algorithm recognizes new patterns, which depends on how the algorithm was trained.
An NN is an important system because it works differently from traditional computing in digital processors and closely mimics processing in the human brain. The human brain is a nonlinear, highly complex, and parallel information processing system. It can arrange its neurons to perform certain operations faster than the fastest digital computers. The brain can recognize familiar faces in an unfamiliar scene within approximately 100-200 ms, whereas a conventional computer may take a long time to do even less complex tasks. Thus, a neural network is a machine designed to model the way the brain performs a particular task. An NN can be simulated in software to perform a task, and it can be further implemented and accelerated in hardware. NNs are based on parallel computation using multiple layers that consist of basic units called nodes or neurons. Each node processes its input data and sends the output to the nodes of the next layer.
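As a purely illustrative sketch (not the hardware design proposed in this paper), the computation each node performs can be written in Python; the sigmoid activation and the tiny layer size are assumptions made only for this example.

```python
import math

def node_output(inputs, weights, bias):
    # Weighted sum of the inputs plus a bias term.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Nonlinear activation (sigmoid chosen for illustration).
    return 1.0 / (1.0 + math.exp(-z))

def layer_output(inputs, weight_rows, biases):
    # Every node in a layer sees the same input vector; the list of
    # node outputs becomes the input of the next layer.
    return [node_output(inputs, w, b) for w, b in zip(weight_rows, biases)]
```

Chaining `layer_output` calls, with each layer's output fed to the next, gives the feed-forward behavior described above.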
In recent years, Deep Learning (DL) has become the state of the art for Machine Learning (ML). Many classification applications have been studied, and methods based on DL provide improvements in different domains, including object detection [3], action recognition [4], face and speech recognition [5], semantic segmentation [4], and computational finance [6]. The deep convolutional neural network (CNN) is one of the most popular methods due to its ability to learn hierarchical abstractions of the data by encoding them at different layers. DL methods have achieved better classification performance than traditional scene classification methods in the remote sensing domain [7].
NNs rely on many layers for complex applications. Therefore, the network is large and requires a large area for hardware realization. A fully connected layer is used as the last stage in different deep neural network architectures. For example, a CNN uses a fully connected layer as the last stage after the convolution and pooling layers. A fully connected layer in a CNN typically contains approximately 4096 neurons or more. Clearly, a new research direction that focuses on decreasing the neural network size while retaining the precision is needed. The focus of this paper is to reduce the hardware size of such large NNs.
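To see why such layers dominate the hardware area, one can count the parameters (and hence the multiply-accumulate resources) a fully connected layer requires; the input width used below is a hypothetical value chosen only for illustration.

```python
def fc_params(n_inputs, n_neurons):
    # One weight per input-neuron connection plus one bias per neuron.
    return n_inputs * n_neurons + n_neurons

# A 4096-neuron fully connected layer fed by 4096 inputs (assumed width)
# already needs about 16.8 million parameters.
print(fc_params(4096, 4096))  # prints 16781312
```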
The work in [8] presented time division multiplexing of a communication protocol for a neuromorphic system to minimize the physical number of interconnects between neurons. The time division multiplexing is based on a single channel to minimize the interconnection cost and processing time. The method achieved better energy efficiency and lower implementation complexity in terms of interconnects; the analytical and numerical results show 40% lower interconnect energy consumption in a network of 1024 neurons. The work in [9] proposed two approaches to reuse resources in feed-forward neural networks: coalescing and folding. In the coalescing approach, a single stack of neurons performs both the feature extraction task and the classification task using shared resources. In the folding approach, the neurons of a high-dimension feed-forward layer are folded to execute multiple tasks; folding can also be combined with low-precision modules. The proposed techniques were tested on binary and multi-class (MNIST) classification tasks. The simulation results show a power consumption of 3.65 mW and a classification accuracy of 91.2% on MNIST. The work in [10] presented an approach for tracking a mobile system using an artificial neural network. Data from the environment are collected by the mobile system through an ultrasonic transmitter and receiver, and a binary artificial neural network then processes the data. The method is implemented on a Xilinx Zynq-7000 SoC FPGA, which includes the hardware neural network and a processor that contains the interfaces. The simulation results show that the system can identify the position of a track in less than 1 us.
One of the main problems in neural network architectures is area overhead. Some applications require many layers to achieve a certain performance in image recognition, speech recognition, decision making, etc. This paper presents a hardware neural network approach with a lower hardware area: an architecture using N/2 hardware layers attains the same performance as an architecture using N layers. The implementation of the proposed hardware node differs from that of the traditional node. The proposed approach saves 42% of the area overhead compared with the network using N layers. Reducing the area helps to reduce the hardware cost.
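The hardware details of the proposed architecture are given in Section II. As a purely software analogy, the effect of reusing one physical layer for several logical layers can be sketched by time-multiplexing weight sets through the same computation; the ReLU activation and the weight values here are assumptions of this sketch, not the paper's design.

```python
def run_layer(x, weights, biases):
    # One physical layer: matrix-vector product followed by ReLU.
    return [max(0.0, sum(xi * wi for xi, wi in zip(x, row)) + b)
            for row, b in zip(weights, biases)]

def multiplexed_forward(x, weight_sets, bias_sets):
    # The same physical layer is reused once per logical layer: each
    # pass loads the next weight set, trading latency for area.
    for w, b in zip(weight_sets, bias_sets):
        x = run_layer(x, w, b)
    return x

# Two logical layers evaluated on one reused "layer" (identity weights).
identity = [[1.0, 0.0], [0.0, 1.0]]
print(multiplexed_forward([1.0, 2.0], [identity, identity],
                          [[0.0, 0.0], [0.0, 0.0]]))  # prints [1.0, 2.0]
```

In this analogy the sequential reuse doubles the number of passes through the shared layer, which mirrors the paper's trade-off of a minimal increase in time complexity for a halved physical layer count.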
The rest of the paper is organized as follows. Section II presents the proposed approach and its architecture. Section III presents the implementation of the proposed approach and the simulation results of the experimental tests. The conclusion is presented in Section IV.
II. PROPOSED NEURAL NETWORK ARCHITECTURE
An NN is based on a collection of nodes (neurons), and each connection between nodes can transmit a signal from one node to another, as shown in Fig. 1. Each input is multiplied by a weight, and then the result feeds the equivalent of a cell body. The weighted signals are
745 978-1-5386-9562-3/18/$31.00 ©2018 IEEE