International Research Journal of Engineering and Technology (IRJET) | e-ISSN: 2395-0056 | p-ISSN: 2395-0072
Volume: 04 Issue: 12 | Dec-2017 | www.irjet.net
© 2017, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 1471

Sentiment Analysis and Classification of Tweets Using Data Mining

Md Shoeb 1, Jawed Ahmed 2
1 Student, Department of Computer Science, Hamdard University, New Delhi, India
2 Assistant Professor, Department of Computer Science, Hamdard University, New Delhi, India

Abstract - Social networking sites such as Twitter and Facebook are now a major channel of communication for internet users, which makes them an important source for understanding people's opinions, views and emotions. In this paper, we use data mining classification techniques to perform sentiment analysis on the views people share on Twitter, one of the most widely used social networking sites. We collect a dataset of tweets from Twitter and apply text mining techniques (transformation, tokenization, stemming, etc.) to convert them into a usable form, which is then used to build a sentiment classifier. The classifier is built with the Rapid Miner tool. We apply three different classifiers to the data and compare the results to find which one gives the best accuracy.

Key Words: Rapid Miner, Classification, data mining, sentiment analysis

1. INTRODUCTION

In recent times, people use social networking sites such as Twitter and Facebook, as well as blogs, to express their sentiments, views, feedback and opinions, and the opinions of other people have always been important to us in many ways. There is therefore a need to analyze these views and sentiments.
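As a rough illustration of the text mining steps named in the abstract (transformation, tokenization, stemming), the following Python sketch shows what such a pipeline might look like. It is not the Rapid Miner process used in this work: the stop-word list, the function names and the suffix-stripping rule are assumptions made for illustration, and the crude stemmer is only a stand-in for a real algorithm such as Porter's.

```python
import re

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on"}


def transform(text):
    """Transformation: lower-case the tweet and drop URLs and @mentions."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"@\w+", " ", text)
    return text


def tokenize(text):
    """Tokenization: keep runs of letters, discarding digits and punctuation."""
    return re.findall(r"[a-z]+", text)


def crude_stem(token):
    """Stemming stand-in: strip one common English suffix if enough remains."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(tweet):
    """Run transformation, tokenization, stop-word removal and stemming."""
    tokens = tokenize(transform(tweet))
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


if __name__ == "__main__":
    print(preprocess("Enjoying the updated camera, worked great! http://t.co/x @bob"))
```

The resulting token lists would typically be turned into word-count or TF-IDF vectors before a classifier is trained on them.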
Sentiment analysis is the application of natural language processing, text analytics, and computational linguistics to recognize and extract useful information from source material [1]. It aims to determine the attitude of a speaker or writer towards a topic or event by analyzing their comments on social networking sites.

Data mining, also called knowledge discovery in databases, is the complete process of discovering useful knowledge from data. It is the process of finding interesting and useful patterns and relationships in large volumes of data [2].

Data classification is the process of assigning data to categories so that it can be used most effectively. The goal of classification is to accurately predict the target class for each case in the data. An algorithm that implements classification is known as a classifier. The term "classifier" sometimes also refers to the mathematical function implemented by a classification algorithm.

Text mining is the analysis of text in order to extract useful information from it. It processes textual information and extracts meaningful data from the text. Generally, natural language processing, information retrieval methods or other text pre-processing is applied first so that data mining algorithms can be used on the result.

In this work, we use three different classifiers to extract the thoughts and sentiments that people share on Twitter through their tweets and to classify them into different categories. We then compare the results to find out which classifier gives the best result in terms of precision, recall and accuracy.

2. RELATED WORKS

This section reviews previous work in the field of sentiment analysis on live data.
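Several of the studies reviewed here are compared by classification accuracy, and this work additionally reports precision and recall. As a reminder of how these measures are usually defined for one class of interest, here is a minimal Python sketch. The function name, the label strings and the sample data are hypothetical and are not taken from the paper's experiments.

```python
def evaluate(y_true, y_pred, positive="positive"):
    """Return precision, recall and accuracy for one class of interest."""
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: predicted positive but actually another class.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: actually positive but predicted as another class.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy}


if __name__ == "__main__":
    # Made-up labels for five tweets (illustration only).
    truth = ["positive", "positive", "negative", "negative", "positive"]
    predicted = ["positive", "negative", "negative", "positive", "positive"]
    print(evaluate(truth, predicted))
```

Precision is the share of tweets predicted as the class that really belong to it, recall is the share of tweets of that class that the classifier found, and accuracy is the share of all tweets labeled correctly.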
A lot of work has been carried out to date in this field on data from social media users, in order to extract people's sentiments towards topics, products, trends, etc. These studies focus on extracting useful information from users' natural language and processing it to obtain the real sentiment expressed.

Osaimi and Badruddin [3] worked on sentiment analysis of Arabic-language tweets on Twitter. They built different classifiers by training them on a suitable dataset and then analyzed the accuracy and results of these classifiers in predicting the correct sentiments.

Pragya Tripathi, Santosh Kr Vishwakarma and Ajay Lala [4] proposed work on the sentiment analysis of English tweets using Rapid Miner. They collected a dataset of natural-language tweets from Twitter, applied text mining techniques to it, and used the result to build a sentiment classifier.

O'Keefe et al. [5] proposed a technique to select features and attribute weights and applied two classifiers to it, Naïve Bayes and SVM. The authors obtained a classification accuracy of 87.15% using only 29% of the selected attributes.

Pak and Paroubek [6] have also worked in this field. The authors used Twitter data to perform linguistic analysis and then built a highly efficient classifier.

Pang and Lee [7] presented a broad overview of the existing work done by Pak and Paroubek. In their survey, the authors describe the existing techniques and approaches for information retrieval.

K. Bhuvaneswari and R. Parimala [8] proposed a method for sentiment classification using correlation-based feature selection. They applied different data pre-processing techniques, then used a correlation attribute method for feature selection, and finally implemented two classifiers, Naïve Bayes and Support Vector Machine, and evaluated the results.

Farhan Laeeq, Md.
Tabrez Nafis and Mirza Rahil Beg [9] proposed an approach to sentiment classification of social media. In their