Journal of Advances in Technology and Engineering Studies JATER
2016, 2(5): 156-163
PRIMARY RESEARCH
Batch size for training convolutional neural networks for sentence classification
Nabeel Zuhair Tawfeeq Abdulnabi 1*, Oğuz Altun 2
1, 2 Computer Engineering Department, Yildiz Technical University, Istanbul, Turkey
Index Terms
Convolutional Neural Network
Sentence Classification
Word embedding
Received: 3 August 2016
Accepted: 10 September 2016
Published: 27 October 2016
Abstract— Sentence classification of short texts, such as single sentences from movie reviews, is a hard
problem because of the limited information they normally contain. We present a Convolutional Neural
Network (CNN) architecture and improved hyper-parameter values for learning sentence classification
with no preprocessing on small datasets. The CNN used in this work has multiple stages. First, the input
layer consists of the concatenated word embeddings of the sentence. This is followed by a convolutional
layer with filters of different sizes for learning sentence-level features, and then by a max-pooling layer
that concatenates the resulting features to form the final feature vector. Lastly, a softmax classifier is used.
In our work we allow the network to handle arbitrary batch sizes with different dropout ratios, which gives
us an effective way to regularize our CNN, blocking neurons from co-adapting and forcing them to learn
useful features. By using a CNN with multiple filter sizes we can detect specific features such as the
existence of negations like "not amazing". Our approach achieves state-of-the-art results for binary
positive/negative sentence sentiment prediction.
©2016 TAF Publications. All rights reserved.
I. INTRODUCTION
In Natural Language Processing (NLP), most deep learning
work deals with learning word vector embeddings using a
neural network [1].
A CNN, a neural network that shares its parameters
across space, is considered responsible for a major break-
through in sentence classification. Recently, researchers
have started to employ CNNs in NLP and have obtained
promising results, especially in sentence classification
[2].
To better understand a CNN, we can think of it as a
sliding window applied to a matrix, with weights shared
by the computation units in the same layer. This weight
sharing enables learning valuable features regardless of
their location, while still preserving information about
where the beneficial features appear.
A CNN for sentence classification is considered quite
powerful because it learns how to weight individual words
within a fixed-size window in order to produce useful
features for a specific task.
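As a concrete illustration of the fixed-size sliding window and max-pooling described above, the following sketch applies filters of several widths to a sentence's word-embedding matrix, max-pools each feature map over time, and feeds the concatenated feature vector to a softmax classifier. It is a minimal sketch in plain NumPy: the dimensions are toy values and the random weights stand in for learned parameters, so it illustrates the architecture rather than reproducing the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

n_words, emb_dim = 7, 50      # toy sentence of 7 words, 50-d embeddings
filter_sizes = [3, 4, 5]      # sliding-window widths (multiple filter sizes)
n_filters = 4                 # feature maps per filter size
n_classes = 2                 # binary positive/negative sentiment

# Stand-in for the concatenated word embeddings of one sentence.
X = rng.normal(size=(n_words, emb_dim))

def conv_maxpool(X, W, b):
    """Slide an (h x emb_dim) filter over the sentence matrix; the same
    weights W are applied at every position (weight sharing), then the
    feature map is max-pooled over time into a single scalar."""
    h = W.shape[0]
    feats = [np.tanh(np.sum(X[i:i + h] * W) + b)
             for i in range(len(X) - h + 1)]
    return max(feats)

# One feature per (filter size, filter index) pair.
pooled = []
for h in filter_sizes:
    for _ in range(n_filters):
        W = rng.normal(scale=0.1, size=(h, emb_dim))
        pooled.append(conv_maxpool(X, W, b=0.0))

z = np.array(pooled)          # final feature vector, length 3 * 4 = 12

# Softmax classifier on top of the pooled features.
Wo = rng.normal(scale=0.1, size=(n_classes, z.size))
logits = Wo @ z
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(z.shape, probs)
```

Note how the final feature vector has one entry per filter regardless of sentence length, which is what lets the network accept sentences of arbitrary length before the softmax layer.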
II. RELATED WORK
Recently, [3] presented a way to train simple convo-
lutional neural networks with one convolutional layer
on top of word vectors obtained from an unsupervised
neural language model, keeping the word vectors static
and learning only the other parameters of the model.
They found that features extracted from a pre-trained
deep learning model give good results across different tasks.
[4] introduced an excellent study on character-level Con-
vNets for text classification, comparing ConvNets against
traditional methods such as bag-of-words and n-grams.
Their results illustrate that character-level convolutional
neural networks are an efficient method. On the other hand,
[5] reported a good way to model short texts using semantic
clustering and CNN. They find that a model using pre-trained
word embeddings introduces extra knowledge, and that
multi-scale SUs in short texts are detected. [6] introduced
a model to capture both semantic and sentiment similarities
among words. The semantic
component of their model learns word vectors via an-
* Corresponding author: Nabeel Zuhair Tawfeeq Abdulnabi
† Email: nabil78.nz@gmail.com