DAEDALUS at RepLab 2012: Polarity Classification and Filtering on Twitter Data

Julio Villena-Román¹,², Sara Lana-Serrano³,¹, Cristina Moreno-García¹, Janine García-Morera¹, José Carlos González-Cristóbal³,¹

¹ DAEDALUS - Data, Decisions and Language, S.A.
² Universidad Carlos III de Madrid
³ Universidad Politécnica de Madrid

jvillena@daedalus.es, slana@diatel.upm.es, cmoreno@daedalus.es, jgarcia@daedalus.es, josecarlos.gonzalez@upm.es

Abstract. This paper describes our participation in the RepLab 2012 profiling scenario, in both the polarity classification and filtering subtasks. Our approach is based on 1) the information provided by a semantic model that includes rules and resources annotated for sentiment analysis, 2) a detailed morphosyntactic analysis of the input text that lemmatizes the text and divides it into segments, which makes it possible to control the scope of semantic units and to perform fine-grained detection of negation in clauses, and 3) an aggregation algorithm, which includes an outlier filter, that calculates the global polarity value of the text from the local polarity values of the different segments. The system, experiments and results are presented and discussed in the paper.

Keywords: RepLab, CLEF, reputation analysis, profiling scenario, filtering, polarity classification, sentiment analysis, STILUS.

1 Introduction

According to the Merriam-Webster dictionary¹, reputation is the overall quality or character of a given person or organization as seen or judged by people in general, or, in other words, the general recognition by other people of some characteristics or abilities of a given entity. In turn, reputation analysis is the process of tracking, investigating and reporting an entity's actions and other entities' opinions about those actions. It covers many factors to calculate the market value of reputation.
Reputation analysis has come into wide use as a major factor of competitiveness in the increasingly complex marketplace of personal and business relationships among people and companies. From the technology perspective, the first step towards automatic reputation analysis is sentiment analysis, i.e., the application of natural language processing and text analytics to identify and extract subjective information from texts about the sentiments, emotions or opinions they contain.

¹ http://www.merriam-webster.com/