Bearish-Bullish Sentiment Analysis on Financial Microblogs

Amna Dridi¹, Mattia Atzeni¹, and Diego Reforgiato Recupero¹

¹ University of Cagliari, Mathematics and Computer Science Department, Via Ospedale 72, 09124, Cagliari, Italy
{amna, diego.reforgiato}@unica.it

Abstract. User-generated data in blogs and social networks has recently become a valuable resource for sentiment analysis in the financial domain, since it has been shown to be extremely significant to marketing research companies and public opinion organizations. In this paper, a fine-grained approach is proposed to predict a real-valued sentiment score. We use several feature sets consisting of lexical features, semantic features, and a combination of lexical and semantic features. To evaluate our approach, a dataset of microblog messages is used. Since our dataset includes confidence scores that are real numbers in the range [0, 1], we compare the performance of two learning methods: Random Forest and SVR. We test the results of the model enhanced with semantics against the classification results obtained with n-grams. Our results indicate that our approach achieves an accuracy of more than 72% in some cases.

1 Introduction

Sentiment analysis in the financial domain is becoming an increasingly important concern for businesses, organizations, and marketing researchers. This is mainly due to the high subjectivity of user-generated content: users freely express their opinions through opinionated sentences, in contrast to news articles, which are known for their objectivity and implicit opinions [1]. Both lexicon-based [2, 3] and machine learning methods [4, 1] have been used for mining users' opinions in the financial domain. Most lexicon-based methods have focused on coarse-grained analysis of the sentiment expressed in text.
However, coarse-grained methods are insufficient for detecting and classifying the polarity of sentiment expressed about companies in financial news text, as not all expressions of sentiment relate to the company of interest [5]. To tackle this problem, machine learning techniques have recently been proposed [1, 5, 6]. These mainly investigate fine-grained schemas that pinpoint the particular phrases in a text that express sentiment and analyze these sentiment expressions in a fine-grained manner. Both lines of research on sentiment analysis in the financial domain still focus heavily on word-occurrence methods and seldom use even WordNet [7], consequently ignoring advances in semantic techniques.
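To make the limitation of coarse-grained scoring concrete, the following minimal sketch contrasts a message-level lexicon score with a score restricted to words near a target company. It is an illustration only and not the approach proposed in this paper: the toy lexicon `LEXICON`, the functions `coarse_score` and `targeted_score`, the token window of 3, and the example message are all hypothetical.

```python
import re

# Hypothetical toy polarity lexicon, for illustration only.
LEXICON = {
    "bullish": 1.0, "surge": 1.0, "gain": 0.5,
    "bearish": -1.0, "drop": -1.0, "loss": -0.5,
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens, including cashtags like $aapl."""
    return re.findall(r"\$?[a-z]+", text.lower())


def coarse_score(text):
    """Coarse-grained score: mean polarity of all lexicon words in the message."""
    hits = [LEXICON[t] for t in tokenize(text) if t in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def targeted_score(text, target, window=3):
    """Mean polarity of lexicon words within `window` tokens of the target mention."""
    tokens = tokenize(text)
    positions = [i for i, t in enumerate(tokens) if t == target.lower()]
    hits = [
        LEXICON[t]
        for i, t in enumerate(tokens)
        if t in LEXICON and any(abs(i - p) <= window for p in positions)
    ]
    return sum(hits) / len(hits) if hits else 0.0


# Hypothetical microblog message mentioning two companies.
message = "Bullish on $AAPL after earnings, but $TSLA looks bearish with a big drop"

print(coarse_score(message))             # one score for the whole message
print(targeted_score(message, "$AAPL"))  # score near the $AAPL mention
print(targeted_score(message, "$TSLA"))  # score near the $TSLA mention
```

In this sketch, the message-level score mixes the opinions about both companies, whereas the windowed score separates them. A fixed token window is a crude stand-in for the phrase-level analysis discussed above.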