International Journal of Scientific & Engineering Research Volume 8, Issue 6, June-2017 1155
ISSN 2229-5518
IJSER © 2017 http://www.ijser.org

Building Sentiment analysis Model using Graphlab

Mona Mohamed Nasr, Essam Mohamed Shaaban, and Ahmed Mostafa Hafez

Abstract: Sentiment analysis, also called opinion mining, is the field of study that analyzes people’s opinions, sentiments, evaluations, appraisals, attitudes, and emotions towards entities such as products, services, organizations, individuals, issues, events, topics, and their attributes. Motivated by the importance of sentiment analysis for individuals in general, and for large organizations in particular, this paper builds sentiment models using GraphLab. Several classifiers (SVM, logistic regression, and boosted trees) were combined with text feature selection techniques to predict positive and negative sentiments. The classifiers were applied to a hotel reviews dataset collected from the Trip Advisor website to emulate real customer opinions. The results showed that the SVM classifier combined with the N-gram feature selection technique outperformed the others.

Keywords: Classification, Feature Selection, Support Vector Machine (SVM), Logistic Regression, Decision trees.

1 INTRODUCTION
The revolution of social media (e.g., reviews, forum discussions, blogs, microblogs, Twitter, and social networks) makes it easy to find reviews of any product.
Hence the need for analyzing sentiments (reviews) has emerged. Sentiment analysis, also called opinion mining, is the field of study that analyzes people’s opinions, sentiments, evaluations, appraisals, attitudes, and emotions towards entities such as products, services, organizations, individuals, issues, events, topics, and their attributes [1].

In recent years, many researchers have built sentiment models to analyze product reviews and classify them into positive and negative sentiments. Ortigosa et al. [2] proposed a hybrid approach that combines lexical-based and machine-learning techniques. Their results showed that it is feasible to perform sentiment analysis on Facebook with high accuracy (83.27%). Parkhe and Biswas [3] focused on aspect-based sentiment analysis of movie reviews in order to find the aspect-specific driving factors. These factors are the scores given to various movie aspects; in general, the aspects with high driving factors influence the polarity of the review the most. They relied on lexicons, part-of-speech (POS) tagging, and Naïve Bayes and SVM classifiers. Their results showed that giving high driving factors to the Movie, Acting, and Plot aspects of a movie yielded the highest accuracy in the analysis of movie reviews, about 79.372%. Nagamma et al. [4] applied sentiment analysis to study the relationship between the online reviews of a movie and the movie's box office revenue performance. They used a hybrid approach that combines Term Frequency (TF) and Inverse Document Frequency (IDF) values as features with Fuzzy Clustering and a Support Vector Machine (SVM) classifier to predict the trend of the box office revenue from the review sentiment. Their results showed that clustering the reviews improved the accuracy of the SVM classifier from 62% (without clustering) to 89.65% (with clustering).
The NB classifier, by contrast, gave an accuracy of 72.41% under both conditions. Hegde & Padma [5] presented a case study of Kannada sentiment analysis (SA) for mobile product reviews. They used a lexicon-based method for aspect extraction, and applied the Naive Bayes classification model to analyze the polarity of the sentiment because of its computational simplicity and stochastic robustness. A customized corpus was developed for this purpose. Their preliminary results indicate that this approach is efficient, achieving 65% accuracy for Kannada SA.

In this paper, sentiment models were built using SVM, decision trees, and logistic regression on a hotel reviews dataset crawled from Trip Advisor, after the data had been modified and transformed from web form to CSV form. All models were built using IPython Notebook with the Graphlab module and the SFrame package. The results show that the SVM-based sentiment model with N-gram features is superior to the others.

2 IMPLEMENTATION PACKAGE
During the implementation phase, IPython Notebook with GraphLab Create was used because it scales to much larger data than other available tools such as Pandas.
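
To make the notion of N-gram features concrete, the following is a minimal sketch of how a review can be turned into unigram and bigram counts before being passed to a classifier. It is not the authors' code and does not use GraphLab Create; it relies only on the Python standard library, and the function names (`tokenize`, `count_ngrams`, `ngram_features`) as well as the sample review are hypothetical and chosen for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def count_ngrams(text, n=2):
    """Count word n-grams; each n-gram is stored as a space-joined string."""
    tokens = tokenize(text)
    grams = (" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return Counter(grams)


def ngram_features(text, max_n=2):
    """Merge the counts of all n-grams for n = 1 .. max_n into one feature map."""
    features = Counter()
    for n in range(1, max_n + 1):
        features.update(count_ngrams(text, n))
    return features


# Hypothetical review text, used only to illustrate the feature map.
review = "The room was not clean and the staff was not helpful"
features = ngram_features(review, max_n=2)
```

In this sketch the bigram "not clean" appears as a feature of its own, whereas unigram counts alone would only record "not" and "clean" separately. This is one plausible reason why N-gram features can help a sentiment classifier, although the sketch itself makes no claim about the accuracy figures reported in this paper.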