Bearish-Bullish Sentiment Analysis on Financial Microblogs

Amna Dridi, Mattia Atzeni, and Diego Reforgiato Recupero

University of Cagliari, Mathematics and Computer Science Department, Via Ospedale 72, 09124, Cagliari, Italy
{amna, diego.reforgiato}@unica.it

Abstract. User-generated data in blogs and social networks has recently become a valuable resource for sentiment analysis in the financial domain, since it has been shown to be extremely significant to marketing research companies and public opinion organizations. In this paper, we propose a fine-grained approach to predict a real-valued sentiment score. We use several feature sets consisting of lexical features, semantic features, and combinations of lexical and semantic features. To evaluate our approach, we use a dataset of microblog messages. Since our dataset includes real-valued confidence scores within the [0, 1] range, we compare the performance of two learning methods: Random Forest and SVR. We test the results of the model trained with semantic features against the classification results obtained with n-grams. Our results indicate that our approach achieves an accuracy of more than 72% in some cases.

1 Introduction

Sentiment analysis in the financial domain is becoming an increasingly important concern for businesses, organizations and marketing researchers. This is mainly due to the high subjectivity of user-generated content, in which users freely express their opinions through opinionated sentences, in contrast to news articles, which are known for their objectivity and implicit opinions [1]. Both lexicon-based [2, 3] and machine learning methods [4, 1] have been used for mining users' opinions in the financial domain. Most lexicon-based methods have focused on coarse-grained analysis of the sentiment expressed in text.
However, coarse-grained methods are insufficient for the detection and polarity classification of sentiment expressed about companies in financial news text, as not all expressions of sentiment are related to the company of interest [5]. To tackle this problem, machine learning techniques have recently been proposed [1, 5, 6]; these mainly investigate schemes that pinpoint the particular phrases in a text that express sentiment and analyze these sentiment expressions in a fine-grained manner. Both lines of research on sentiment analysis in the financial domain still rely heavily on word-occurrence methods and seldom use even WordNet [7], consequently ignoring advances in semantic techniques.