DOI: 10.4018/IJIRR.2019010101
International Journal of Information Retrieval Research
Volume 9 • Issue 1 • January-March 2019
Copyright © 2019, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
Sentiment Analysis Using Cuckoo Search for Optimized Feature Selection on Kaggle Tweets
Akshi Kumar, Delhi Technological University, Delhi, India
Arunima Jaiswal, Indira Gandhi Delhi Technical University for Women, Delhi, India
Shikhar Garg, Delhi Technological University, Delhi, India
Shobhit Verma, Delhi Technological University, Delhi, India
Siddhant Kumar, Delhi Technological University, Delhi, India
ABSTRACT
Selecting an optimal set of features for determining sentiment in online textual content is essential
for good classification results. Optimal feature selection is a computationally hard task, which motivates
novel techniques for improving classifier performance. In this work, the binary
adaptation of cuckoo search (a nature-inspired, meta-heuristic algorithm), known as Binary Cuckoo
Search, is proposed for optimal feature selection in sentiment analysis of online textual content.
Baseline supervised learning techniques, such as SVM, are first implemented with
the traditional tf-idf model and then with the proposed feature optimization model. A benchmark Kaggle
dataset comprising a collection of tweets is used to report the results, which are assessed
on the basis of classification accuracy. Empirical analysis shows that the proposed implementation
of binary cuckoo search for feature selection in a sentiment analysis task outperforms
the elementary supervised algorithms based on the conventional tf-idf score.
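As an illustration of the feature selection idea summarised above, the following is a minimal sketch of a binary cuckoo search, not the authors' implementation. It assumes a commonly used formulation: continuous nest positions are updated with Lévy flights (drawn here with Mantegna's algorithm), mapped to 0/1 feature masks through a sigmoid transfer function, and a fraction `pa` of the worst nests is abandoned in each iteration. All names and parameter values (`binary_cuckoo_search`, `toy_fitness`, `RELEVANT`, `n_nests`, `pa`, `alpha`) are hypothetical. The toy fitness stands in for the classifier accuracy that would be measured on the tf-idf features selected by each mask.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one Levy-distributed step using Mantegna's algorithm."""
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0) or 1e-12  # guard against an exact zero
    return u / abs(v) ** (1 / beta)


def sigmoid(x):
    """Logistic function, clamped so heavy-tailed steps cannot overflow."""
    x = max(-50.0, min(50.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def binarize(position, rng):
    """Map a continuous position to a 0/1 feature mask."""
    return [1 if rng.random() < sigmoid(p) else 0 for p in position]


def random_position(n_features, rng):
    return [rng.uniform(-1.0, 1.0) for _ in range(n_features)]


def binary_cuckoo_search(fitness, n_features, n_nests=15, n_iter=100,
                         pa=0.25, alpha=1.0, seed=0):
    """Return (best_mask, best_score) maximising `fitness` over 0/1 masks."""
    rng = random.Random(seed)
    positions = [random_position(n_features, rng) for _ in range(n_nests)]
    masks = [binarize(p, rng) for p in positions]
    scores = [fitness(m) for m in masks]
    best = max(range(n_nests), key=scores.__getitem__)
    best_mask, best_score = masks[best][:], scores[best]

    for _ in range(n_iter):
        # Levy-flight move for every nest; keep the candidate only if it improves.
        for i in range(n_nests):
            cand_pos = [p + alpha * levy_step(rng) for p in positions[i]]
            cand_mask = binarize(cand_pos, rng)
            cand_score = fitness(cand_mask)
            if cand_score > scores[i]:
                positions[i], masks[i], scores[i] = cand_pos, cand_mask, cand_score

        # Abandon the worst fraction `pa` of nests and rebuild them at random.
        worst = sorted(range(n_nests), key=scores.__getitem__)[:int(pa * n_nests)]
        for i in worst:
            positions[i] = random_position(n_features, rng)
            masks[i] = binarize(positions[i], rng)
            scores[i] = fitness(masks[i])

        # Remember the best mask seen so far.
        i = max(range(n_nests), key=scores.__getitem__)
        if scores[i] > best_score:
            best_mask, best_score = masks[i][:], scores[i]

    return best_mask, best_score


# Hypothetical stand-in for classifier accuracy: reward three "useful"
# features and lightly penalise the number of features selected.
RELEVANT = {0, 3, 7}


def toy_fitness(mask):
    return sum(mask[i] for i in RELEVANT) - 0.1 * sum(mask)


if __name__ == "__main__":
    mask, score = binary_cuckoo_search(toy_fitness, n_features=10)
    print(mask, score)
```

In a setting like the one described in the abstract, `fitness` would instead train and evaluate a classifier (for example an SVM) on the tf-idf columns selected by the mask and return its validation accuracy, which is far more expensive per evaluation than this toy function.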
KEYWORDS
Binary Cuckoo Search, Feature Selection, Kaggle, Sentiment Analysis, Swarm Intelligence
INTRODUCTION
The growing use of social media to voice personal views and beliefs has created a
need for a paradigm that can analyse the vast amount of data involved; this task
is typically referred to as sentiment analysis (Kumar & Sharma, 2016). Formally, sentiment analysis
is defined as the study, and subsequent categorization, of an individual’s feelings and opinions,
communicated through text, with respect to a certain context (Kumar & Abraham, 2017; Kumar
& Teeja, 2012). The categorization is carried out along polarities such as positive and
negative (Kumar & Sebastian, 2012; Kumar & Sharma, 2017).
Sentiment analysis, also known as opinion mining, is the process of identifying and classifying
opinions expressed in a written piece to ascertain the author’s connotation (positive,
objective or negative) in that piece, using a combination of statistical and computational techniques
(Kumar & Jaiswal, 2017).