A Systematic Evaluation of Shallow Convolutional Neural Network on CIFAR Dataset

Reza Fuad Rachmadi, Member, IAENG, I Ketut Eddy Purnama, Member, IAENG, Mauridhi Hery Purnomo, Member, IAENG, Mochamad Hariadi, Member, IAENG

Abstract—The Convolutional Neural Network (CNN) classifier is a very popular classifier used to solve many problems, including image classification and object recognition. CNN classifiers are usually improved by designing deeper and bigger networks, which need more memory and computational power to train and run. In this paper, we analyze and optimize the use of a small and shallow CNN classifier on the CIFAR dataset. The Karpathy ConvNetJS CIFAR10 model was used as the base network of our classifier and extended by adding the max-min pooling method. Max-min pooling is used to exploit both the negative and positive responses of the convolution process, which in theory trains the classifier more effectively. We chose several different configurations to analyze the effectiveness of the classifier by combining the training algorithm, batch normalization configuration, weight initialization method, dropout regularization configuration, and heavy data augmentation. To ensure that the classifier we designed remains a small and shallow CNN classifier, we limit the maximum number of layers in our CNN classifier to 15. Experiments on the CIFAR10 and CIFAR100 datasets show that by compacting the kernels on each layer, the classifier can achieve good accuracy, comparable with other state-of-the-art classifiers with a similar number of layers, with an error rate of 6.99% on the CIFAR10 dataset and 29.41% on the CIFAR100 dataset.

Index Terms—shallow CNN classifier, max-min pooling, deep convolutional neural network, CIFAR dataset

Manuscript received May 17, 2018; revised Jan 28, 2019. This work was partially supported by P3I (Program Percepatan Publikasi Internasional) grants, Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia. Reza Fuad Rachmadi (email: fuad@its.ac.id), I Ketut Eddy Purnama (email: ketut@te.its.ac.id), Mauridhi Hery Purnomo (email: hery@te.its.ac.id), and Mochamad Hariadi (email: hariadi@te.its.ac.id) are with the Department of Computer Engineering, Institut Teknologi Sepuluh Nopember (ITS), Surabaya, Indonesia.

I. INTRODUCTION

A Convolutional Neural Network (CNN) classifier is a very popular classifier used for many applications, including image classification [1]–[7], video analysis [8]–[13], text analysis [14]–[16], and sound analysis [17]–[20]. The era of CNN started when Krizhevsky et al. [2] won the ILSVRC (ImageNet Large Scale Visual Recognition Challenge) in 2012 by a large margin over hand-crafted feature approaches. After the approach of Krizhevsky et al., researchers started developing bigger and deeper CNN classifiers to achieve better evaluation scores, including VGGNet [3], [21], the inception network [4], [5], the residual network (ResNet) [6], and DenseNet [7]. The disadvantage of all modern CNN architectures is the huge number of parameters, which leads to the huge amount of memory required to train and test the classifier. Many parameters of a modern CNN architecture are inactive due to the use of regularization methods (such as dropout), bad weight initialization, or the use of the ReLU (Rectified Linear Unit) activation function.
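The point above can be made concrete with a minimal, purely illustrative sketch: for negative pre-activations, ReLU outputs zero and its (sub)gradient is zero, so no error signal reaches the weights that produced those responses. The function names below are our own, not from the paper's code.

```python
# Illustrative sketch of ReLU blocking the gradient for negative responses.

def relu(x):
    """ReLU activation: passes positive inputs, zeroes out negatives."""
    return max(0.0, x)

def relu_grad(x):
    """Subgradient of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

# Example pre-activations: two of the four are negative.
pre_activations = [2.5, -1.3, 0.7, -4.0]
outputs = [relu(x) for x in pre_activations]
grads = [relu_grad(x) for x in pre_activations]

print(outputs)  # [2.5, 0.0, 0.7, 0.0]
print(grads)    # [1.0, 0.0, 1.0, 0.0] -> negative responses get no gradient
```

In backpropagation, the upstream gradient is multiplied by `relu_grad`, so the weights feeding the two negative responses receive no update from this example; units that stay negative across the training set remain inactive.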
The ReLU activation function is widely used in modern CNN architectures and has proved very effective, but on the other hand, it also disrupts the flow of the gradient in the backpropagation process, which makes many parameters of the classifier inactive. Blot et al. [22] proposed a new pooling method, called max-min pooling, that exploits both the negative and positive responses of the convolution output. Blot et al. [22] showed that the max-min pooling method can reduce the problems that appear when using the ReLU activation function and produces better accuracy compared with a classifier that uses the ReLU activation function alone. In this paper, we investigate several shallow CNN architectures for image classification on the CIFAR dataset. The classifier is designed to exploit both the negative and positive outputs of the convolution process by using the max-min pooling method [22]. Our contributions can be listed as follows.

• We propose a CNN architecture that exploits both the negative and positive responses of the convolution output. The proposed classifier is designed using the max-min pooling method with an additional normalization layer to reduce the number of parameters of the classifier. By using the additional normalization layer, the number of parameters in the classifier is reduced by half compared with a classifier using the original max-min pooling method.

• We investigate several different CNN configurations in initial experiments and choose one of them as the main CNN architecture for our proposed classifier.

• We investigate the effects of different weight initialization, regularization, and heavy data augmentation methods using the main CNN architecture chosen in the initial experiments.

The rest of the paper is organized as follows. Section 2 describes related work on CNN classifier development. The design of our proposed CNN architecture, which is based on the Karpathy ConvNetJS CIFAR10 model, is discussed in Section 3.
Experiments on the CIFAR10 and CIFAR100 datasets are discussed in Sections 4, 5, and 6. In the last section, we summarize and conclude the experiments.

IAENG International Journal of Computer Science, 46:2, IJCS_46_2_24 (Advance online publication: 27 May 2019)
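The max-min pooling idea described above can be sketched in a few lines: the negated feature map is pooled alongside the original, so strong negative convolution responses, which ReLU would zero out, survive as a second output channel. This is a minimal illustration on a single 2D map with hypothetical function names; the actual classifier applies it per channel on 4D tensors, and the paper's additional normalization layer is not shown here.

```python
# Minimal sketch of max-min pooling (after Blot et al. [22]) on one 2D map.

def max_pool2d(fmap, size=2, stride=2):
    """Plain max pooling over windows of a 2D feature map (list of lists)."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(0, h - size + 1, stride):
        row = []
        for j in range(0, w - size + 1, stride):
            window = [fmap[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(max(window))
        out.append(row)
    return out

def max_min_pool2d(fmap, size=2, stride=2):
    """Max-min pooling: pool both the response and its negation.

    One input channel yields two output channels; the negated branch
    preserves strong negative responses that ReLU would discard.
    """
    negated = [[-v for v in row] for row in fmap]
    return max_pool2d(fmap, size, stride), max_pool2d(negated, size, stride)

# Usage: a 4x4 pre-activation map containing strong negative responses.
fmap = [
    [ 1, -5,  2,  0],
    [ 0,  3, -1,  4],
    [-2,  0,  6, -7],
    [ 5, -3,  0,  1],
]
pos, neg = max_min_pool2d(fmap)
print(pos)  # [[3, 4], [5, 6]]
print(neg)  # [[5, 1], [3, 7]] -> the -5 and -7 responses are retained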