Fine-grained Sentiment Classification using BERT

Manish Munikar*, Sushil Shakya† and Aakash Shrestha‡
Department of Electronics and Computer Engineering
Pulchowk Campus, Institute of Engineering, Tribhuvan University
Lalitpur, Nepal
*070bct520@ioe.edu.np, †070bct547@ioe.edu.np, ‡070bct501@ioe.edu.np

Abstract—Sentiment classification is an important process in understanding people's perception towards a product, service, or topic. Many natural language processing models have been proposed to solve the sentiment classification problem. However, most of them have focused on binary sentiment classification. In this paper, we use a promising deep learning model called BERT to solve the fine-grained sentiment classification task. Experiments show that our model outperforms other popular models for this task without sophisticated architecture. We also demonstrate the effectiveness of transfer learning in natural language processing in the process.

Index Terms—sentiment classification, natural language processing, language model, pretraining

I. INTRODUCTION

Sentiment classification is a form of text classification in which a piece of text has to be classified into one of the predefined sentiment classes. It is a supervised machine learning problem. In binary sentiment classification, the possible classes are positive and negative. In fine-grained sentiment classification, there are five classes (very negative, negative, neutral, positive, and very positive). Fig. 1 shows a black-box view of a fine-grained sentiment classifier model.

[Fig. 1. High-level black-box view of a sentiment classifier showing its input and output: review text in, sentiment label (0, 1, 2, 3, 4) out.]

A sentiment classification model, like any other machine learning model, requires its input to be a fixed-sized vector of numbers.
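As a toy illustration of such a fixed-sized representation (using a small hand-made vocabulary of word vectors, not the embeddings used in this paper), a text of any length can be reduced to the average of its word vectors:

```python
# Toy illustration (not this paper's method): map a text to a
# fixed-sized vector by averaging hand-made 2-dimensional word vectors.
word_vectors = {
    "great":    [0.9, 0.1],
    "terrible": [-0.8, 0.2],
    "movie":    [0.0, 0.5],
}

def embed(text, dim=2):
    """Average the vectors of known words; unknown words are skipped."""
    vectors = [word_vectors[w] for w in text.lower().split() if w in word_vectors]
    if not vectors:
        return [0.0] * dim
    return [sum(component) / len(vectors) for component in zip(*vectors)]

print(embed("great movie"))     # same vector length regardless of text length
print(embed("terrible movie"))
```

Whatever the length of the input text, the output vector always has the same dimensionality, which is what a downstream classifier requires.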
Therefore, we need to convert a text—a sequence of words represented as ASCII or Unicode—into a fixed-sized vector that encodes the meaningful information of the text. Many statistical and deep learning NLP models have been proposed just for that. Recently, there has been an explosion of developments in NLP as well as in other areas of deep learning.

While transfer learning (pretraining and fine-tuning) has become the de-facto standard in computer vision, NLP is yet to utilize this concept fully. However, neural language models such as word vectors [1], paragraph vectors [2], and GloVe [3] have started the transfer learning revolution in NLP. Recently, Google researchers published BERT (Bidirectional Encoder Representations from Transformers) [4], a deep bidirectional language model based on the Transformer architecture [5], and advanced the state of the art in many popular NLP tasks. In this paper, we use the pretrained BERT model and fine-tune it for the fine-grained sentiment classification task on the Stanford Sentiment Treebank (SST) dataset.

The rest of the paper is organized into six sections. In Section II, we mention our motivation for this work. In Section III, we discuss related works. In Section IV, we describe the dataset we performed our experiments on. We explain our model architecture and methodology in detail in Section V. Then we present and analyze our results in Section VI. Finally, we provide our concluding remarks in Section VII.

II. MOTIVATION

We have been working on replicating the results of different research papers on sentiment analysis, especially on the fine-grained Stanford Sentiment Treebank (SST) dataset. After the popularity of BERT, researchers have tried to use it on different NLP tasks, including binary sentiment classification on the SST-2 (binary) dataset, and they were able to obtain state-of-the-art results as well. But we have not yet found any experimentation done using BERT on the SST-5 (fine-grained) dataset.
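The kind of fine-grained classifier discussed here can be sketched as a 5-way softmax head on top of a fixed-sized sentence vector. The NumPy sketch below is a minimal illustration with a randomly initialized weight matrix, not the actual BERT-based model; in the real setting, BERT's pooled sentence encoding would play the role of `sentence_vector` and the head's weights would be learned during fine-tuning:

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 8   # size of the fixed sentence vector (768 for BERT-base)
CLASSES = 5  # very negative .. very positive -> labels 0..4

# Randomly initialized classification head; in fine-tuning these
# weights would be trained on labeled SST-5 examples.
W = rng.normal(scale=0.02, size=(CLASSES, HIDDEN))
b = np.zeros(CLASSES)

def softmax(z):
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def classify(sentence_vector):
    """Map a fixed-sized sentence vector to a sentiment label in 0..4."""
    probs = softmax(W @ sentence_vector + b)
    return int(np.argmax(probs)), probs

sentence_vector = rng.normal(size=HIDDEN)  # stand-in for a pooled encoding
label, probs = classify(sentence_vector)
print(label, probs.round(3))
```

The essential point is that the classifier sees only a fixed-sized vector, so the quality of the sentence encoding determines how well the five sentiment classes can be separated.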
Because BERT is so powerful, fast, and easy to use for downstream tasks, it is likely to give promising results on the SST-5 dataset as well. This became the main motivation for pursuing this work.

III. RELATED WORK

Sentiment classification is one of the most popular tasks in NLP, and so there has been a lot of research and progress in solving this task accurately. Most of the approaches have focused on binary sentiment classification, most probably because there are large public datasets for it, such as the IMDb movie review dataset [6]. In this section, we only discuss some significant deep learning NLP approaches applied to sentiment classification.

The first step in sentiment classification of a text is the embedding, where a text is converted into a fixed-size vector. Since the number of words in the vocabulary after tokenization and stemming is limited, researchers first tackled the problem of learning word embeddings. The first promising language

arXiv:1910.03474v1 [cs.CL] 4 Oct 2019