DOI: 10.4018/IJIRR.2019010101
International Journal of Information Retrieval Research
Volume 9 • Issue 1 • January-March 2019
Copyright © 2019, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.

Sentiment Analysis Using Cuckoo Search for Optimized Feature Selection on Kaggle Tweets

Akshi Kumar, Delhi Technological University, Delhi, India
Arunima Jaiswal, Indira Gandhi Delhi Technical University for Women, Delhi, India
Shikhar Garg, Delhi Technological University, Delhi, India
Shobhit Verma, Delhi Technological University, Delhi, India
Siddhant Kumar, Delhi Technological University, Delhi, India

ABSTRACT

Selecting the optimal set of features to determine sentiment in online textual content is imperative for superior classification results. Optimal feature selection is a computationally hard task, which motivates novel techniques for improving classifier performance. In this work, the binary adaptation of cuckoo search (a nature-inspired, meta-heuristic algorithm), known as Binary Cuckoo Search, is proposed for optimal feature selection in the sentiment analysis of online textual content. Baseline supervised learning techniques such as SVM are first implemented with the traditional tf-idf model and then with the proposed feature optimization model. A benchmark Kaggle dataset comprising a collection of tweets is used to report the results, which are assessed on the basis of accuracy. Empirical analysis validates that the proposed use of binary cuckoo search for feature selection in a sentiment analysis task outperforms the elementary supervised algorithms based on the conventional tf-idf score.
KEYWORDS

Binary Cuckoo Search, Feature Selection, Kaggle, Sentiment Analysis, Swarm Intelligence

INTRODUCTION

The growing use of social media to voice personal notions and beliefs has created a need for a paradigm that can analyse the very large volume of data involved; this task is typically referred to as sentiment analysis (Kumar & Sharma, 2016). Formally, sentiment analysis is defined as the study, and subsequent categorization, of an individual's feelings and opinions, communicated through text, with respect to a certain context (Kumar & Abraham, 2017; Kumar & Teeja, 2012). The categorization is carried out along the lines of polarities, such as positive and negative (Kumar & Sebastian, 2012; Kumar & Sharma, 2017). Sentiment analysis, also known as opinion mining, recognizes and labels the opinions communicated in a written piece in order to ascertain the author's connotation (positive, objective or negative), using a combination of statistical and computational techniques (Kumar & Jaiswal, 2017).