A Systematic Evaluation of Shallow Convolutional Neural Network on CIFAR Dataset

Reza Fuad Rachmadi, Member, IAENG, I Ketut Eddy Purnama, Member, IAENG, Mauridhi Hery Purnomo, Member, IAENG, Mochamad Hariadi, Member, IAENG

Abstract—The Convolutional Neural Network (CNN) classifier is a very popular classifier used to solve many problems, including image classification and object recognition. CNN classifiers are usually improved by designing deeper and bigger networks, which need more memory and computational power to train and run. In this paper, we analyze and optimize the use of a small and shallow CNN classifier on the CIFAR dataset. The Karpathy ConvNetJS CIFAR10 model was used as the base network of our classifier and extended by adding the max-min pooling method. Max-min pooling is used to exploit both the negative and positive responses of the convolution process, which in theory trains the classifier more effectively. We chose several different configurations to analyze the effectiveness of the classifier by combining the training algorithm, batch normalization configuration, weight initialization method, dropout regularization configuration, and heavy data augmentation. To ensure that the classifier we designed remains a small and shallow CNN classifier, we limit the maximum number of layers in our CNN classifier to 15. Experiments on the CIFAR10 and CIFAR100 datasets show that by compacting the kernels on each layer, the classifier can achieve good accuracy, comparable with other state-of-the-art classifiers with a similar number of layers, with an error rate of 6.99% on the CIFAR10 dataset and 29.41% on the CIFAR100 dataset.

Index Terms—shallow CNN classifier, max-min pooling, deep convolutional neural network, CIFAR dataset

Manuscript received May 17, 2018; revised Jan 28, 2019. This work was partially supported by P3I (Program Percepatan Publikasi Internasional) grants, Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia. Reza Fuad Rachmadi (email: fuad@its.ac.id), I Ketut Eddy Purnama (email: ketut@te.its.ac.id), Mauridhi Hery Purnomo (email: hery@te.its.ac.id), and Mochamad Hariadi (email: hariadi@te.its.ac.id) are with the Department of Computer Engineering, Institut Teknologi Sepuluh Nopember (ITS), Surabaya, Indonesia.

I. INTRODUCTION

A Convolutional Neural Network (CNN) classifier is a very popular classifier used for many applications, including image classification [1]–[7], video analysis [8]–[13], text analysis [14]–[16], and sound analysis [17]–[20]. The era of CNN started when Krizhevsky et al. [2] won the ILSVRC (ImageNet Large Scale Visual Recognition Challenge) in 2012 by a large margin over hand-crafted feature approaches. After the approach of Krizhevsky et al., researchers started developing bigger and deeper CNN classifiers to achieve better evaluation scores, including VGGNet [3], [21], the inception network [4], [5], the residual network (ResNet) [6], and DenseNet [7]. The disadvantage of all modern CNN architectures is the huge number of parameters, which leads to the huge amount of memory required to train and test the classifier. Many parameters of a modern CNN architecture are inactive due to the use of regularization methods (such as dropout), bad weight initialization, or the use of the ReLU (Rectified Linear Unit) activation function.
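The point above can be made concrete with a minimal, purely illustrative sketch: for negative pre-activations, ReLU outputs zero and its (sub)gradient is zero, so no error signal reaches the weights that produced those responses. The function names below are our own, not from the paper's code.

```python
# Illustrative sketch of ReLU blocking the gradient for negative responses.

def relu(x):
    """ReLU activation: passes positive inputs, zeroes out negatives."""
    return max(0.0, x)

def relu_grad(x):
    """Subgradient of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

# Example pre-activations: two of the four are negative.
pre_activations = [2.5, -1.3, 0.7, -4.0]
outputs = [relu(x) for x in pre_activations]
grads = [relu_grad(x) for x in pre_activations]

print(outputs)  # [2.5, 0.0, 0.7, 0.0]
print(grads)    # [1.0, 0.0, 1.0, 0.0] -> negative responses get no gradient
```

In backpropagation, the upstream gradient is multiplied by `relu_grad`, so the weights feeding the two negative responses receive no update from this example; units that stay negative across the training set remain inactive.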
The ReLU activation function is widely used in modern CNN architectures and has proved very effective, but on the other hand, it also disrupts the flow of the gradient in the backpropagation process, which makes many parameters of the classifier inactive. Blot et al. [22] proposed a new pooling method, called max-min pooling, that exploits both the negative and positive responses of the convolution output. Blot et al. [22] showed that the max-min pooling method can reduce the problems that appear when using the ReLU activation function and produces better accuracy compared with a classifier that uses the ReLU activation function alone. In this paper, we investigate several shallow CNN architectures for image classification on the CIFAR dataset. The classifier is designed to exploit both the negative and positive outputs of the convolution process by using the max-min pooling method [22]. Our contributions can be listed as follows.

• We propose a CNN architecture that exploits both the negative and positive responses of the convolution output. The proposed classifier is designed using the max-min pooling method with an additional normalization layer to reduce the number of parameters of the classifier. By using the additional normalization layer, the number of parameters in the classifier is reduced by half compared with a classifier using the original max-min pooling method.

• We investigate several different CNN configurations in initial experiments and choose one of them as the main CNN architecture for our proposed classifier.

• We investigate the effects of different weight initialization, regularization, and heavy data augmentation methods using the main CNN architecture chosen in the initial experiments.

The rest of the paper is organized as follows. Section 2 describes related work on CNN classifier development. The design of our proposed CNN architecture, which is based on the Karpathy ConvNetJS CIFAR10 model, is discussed in Section 3.
Experiments on the CIFAR10 and CIFAR100 datasets are discussed in Sections 4, 5, and 6. In the last section, we summarize and conclude the experiments.

IAENG International Journal of Computer Science, 46:2, IJCS_46_2_24 (Advance online publication: 27 May 2019)
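The max-min pooling idea described above can be sketched in a few lines: the negated feature map is pooled alongside the original, so strong negative convolution responses, which ReLU would zero out, survive as a second output channel. This is a minimal illustration on a single 2D map with hypothetical function names; the actual classifier applies it per channel on 4D tensors, and the paper's additional normalization layer is not shown here.

```python
# Minimal sketch of max-min pooling (after Blot et al. [22]) on one 2D map.

def max_pool2d(fmap, size=2, stride=2):
    """Plain max pooling over windows of a 2D feature map (list of lists)."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(0, h - size + 1, stride):
        row = []
        for j in range(0, w - size + 1, stride):
            window = [fmap[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            row.append(max(window))
        out.append(row)
    return out

def max_min_pool2d(fmap, size=2, stride=2):
    """Max-min pooling: pool both the response and its negation.

    One input channel yields two output channels; the negated branch
    preserves strong negative responses that ReLU would discard.
    """
    negated = [[-v for v in row] for row in fmap]
    return max_pool2d(fmap, size, stride), max_pool2d(negated, size, stride)

# Usage: a 4x4 pre-activation map containing strong negative responses.
fmap = [
    [ 1, -5,  2,  0],
    [ 0,  3, -1,  4],
    [-2,  0,  6, -7],
    [ 5, -3,  0,  1],
]
pos, neg = max_min_pool2d(fmap)
print(pos)  # [[3, 4], [5, 6]]
print(neg)  # [[5, 1], [3, 7]] -> the -5 and -7 responses are retained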