Journal of Advances in Technology and Engineering Studies JATER
2016, 2(5): 156-163
PRIMARY RESEARCH
Batch size for training convolutional neural networks for sentence classification
Nabeel Zuhair Tawfeeq Abdulnabi 1*, Oğuz Altun 2
1, 2 Computer Engineering Department, Yildiz Technical University, Istanbul, Turkey
Index Terms
Convolutional Neural Network
Sentence Classification
Word embedding
Received: 3 August 2016
Accepted: 10 September 2016
Published: 27 October 2016
Abstract— Sentence classification of short texts, such as single sentences from movie reviews, is a hard
problem because of the limited information they normally contain. We present a Convolutional Neural
Network (CNN) architecture and improved hyper-parameter values for learning sentence classification
with no preprocessing on small datasets. The CNN used in this work has multiple stages. First, the input
layer consists of the concatenated word embeddings of the sentence. This is followed by a convolutional
layer with filters of different sizes for learning sentence-level features, and then by a max-pooling layer
that concatenates the resulting features to form the final feature vector. Lastly, a softmax classifier is used.
In our work we allow the network to handle arbitrary batch sizes with different dropout ratios, which gives
us an effective way to regularize our CNN, blocking neurons from co-adapting and forcing them to learn
useful features. By using a CNN with multiple filter sizes we can detect specific features such as the
existence of negations like "not amazing". Our approach achieves state-of-the-art results for binary
positive/negative sentence sentiment prediction.
©2016 TAF Publications. All rights reserved.
I. INTRODUCTION
In Natural Language Processing (NLP), most deep learning
work deals with learning word vector embeddings using a
neural network [1].
A CNN, a neural network that shares its parameters
across space, is considered responsible for a major break-
through in sentence classification. Recently, researchers
have started to employ CNNs in NLP and have obtained
promising results, especially in sentence classification
[2].
To better understand a CNN, we can think of it as a
sliding window applied to a matrix, with weights shared
by the computation units in the same layer. This weight
sharing enables learning valuable features regardless of
their location, while still preserving information about
where the beneficial features appear.
A CNN for sentence classification is considered quite
powerful because it learns how to weight individual words
within a fixed-size window in order to produce useful
features for a specific task.
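As a concrete illustration of the fixed-size sliding window and max-pooling described above, the following sketch applies filters of several widths to a sentence's word-embedding matrix, max-pools each feature map over time, and feeds the concatenated feature vector to a softmax classifier. It is a minimal sketch in plain NumPy: the dimensions are toy values and the random weights stand in for learned parameters, so it illustrates the architecture rather than reproducing the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

n_words, emb_dim = 7, 50      # toy sentence of 7 words, 50-d embeddings
filter_sizes = [3, 4, 5]      # sliding-window widths (multiple filter sizes)
n_filters = 4                 # feature maps per filter size
n_classes = 2                 # binary positive/negative sentiment

# Stand-in for the concatenated word embeddings of one sentence.
X = rng.normal(size=(n_words, emb_dim))

def conv_maxpool(X, W, b):
    """Slide an (h x emb_dim) filter over the sentence matrix; the same
    weights W are applied at every position (weight sharing), then the
    feature map is max-pooled over time into a single scalar."""
    h = W.shape[0]
    feats = [np.tanh(np.sum(X[i:i + h] * W) + b)
             for i in range(len(X) - h + 1)]
    return max(feats)

# One feature per (filter size, filter index) pair.
pooled = []
for h in filter_sizes:
    for _ in range(n_filters):
        W = rng.normal(scale=0.1, size=(h, emb_dim))
        pooled.append(conv_maxpool(X, W, b=0.0))

z = np.array(pooled)          # final feature vector, length 3 * 4 = 12

# Softmax classifier on top of the pooled features.
Wo = rng.normal(scale=0.1, size=(n_classes, z.size))
logits = Wo @ z
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(z.shape, probs)
```

Note how the final feature vector has one entry per filter regardless of sentence length, which is what lets the network accept sentences of arbitrary length before the softmax layer.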
II. RELATED WORK
Recently, [3] presented a way to train simple convo-
lutional neural networks with one convolutional layer
on top of word vectors obtained from an unsupervised
neural language model, keeping the word vectors static
and learning only the other parameters of the model.
They found that features extracted from a pre-trained
deep learning model give good results across different tasks.
[4] introduced an excellent study on character-level Con-
vNets for text classification, comparing ConvNets against
traditional methods such as bag-of-words and n-grams.
Their results illustrate that character-level convolutional
neural networks are an efficient method. On the other hand,
[5] reported a good way to model short texts using semantic
clustering and CNN. They find that a model using pre-trained
word embeddings introduces extra knowledge, and that
multi-scale SUs in short texts are detected. [6] introduced
a model to capture both semantic and sentiment similarities
among words. The semantic
component of their model learns word vectors via an-
* Corresponding author: Nabeel Zuhair Tawfeeq Abdulnabi
† Email: nabil78.nz@gmail.com