Rule-based approach to sentiment analysis at ROMIP 2011

Dmitry Kan
AlphaSense Inc
dmitry.kan@gmail.com

Abstract. This paper describes a rule-based approach to sentiment analysis that performs shallow parsing of an input text in the Russian language and applies a set of linguistic rules to resolve the sentiment of a given chunk (subclause, sentence or text). The algorithm shows decent performance (90% precision for the positive class) in the cases where annotators agreed on a sentiment label, and it also supports sentiment classification relative to an object mentioned in the text.

Keywords: sentiment classification, sentiment analysis, opinion mining, ROMIP, blog data, rule-based sentiment analysis

1 Introduction

Sentiment analysis, while being a subtask of artificial intelligence (or, more strictly, of Natural Language Processing), remains rather vaguely defined. This is supported by inter-annotator agreement levels, which peak at around 80% [1]. This figure can in a way be considered a target level for sentiment detection accuracy, beyond which further improvements require more fine-grained tuning of algorithms. Part of the difficulty in defining the task lies in how each human annotator perceives the sentiment expressed in an utterance, which depends on the annotator's current mood, personal attitude to the subject of the utterance, and the stated goal of the annotation. Rule-based algorithms bring a good level of transparency, which makes it possible to answer a simple question: why did the system assign a given chunk to a certain sentiment class?

The paper is organized as follows. Section 2 presents the rule-based algorithm for the Russian language, which can classify into 2 classes (negative and positive) or 3 classes (negative, neutral and positive). Section 3 gives a breakdown by metric in relation to the other participating systems. Section 4 concludes the paper and outlines open problems to be solved.
2 Rule-based linguistic algorithm

The approach relies on shallow parsing, and its aim is twofold: high processing speed and robustness to previously unseen, potentially colloquial, utterances. It is worth mentioning that for rule-based systems the complexity of the sentiment detection task grows roughly as follows: sentiment of a subclause (easiest), sentiment of a sentence (moderate), sentiment of the entire text (hardest).
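
The three levels of granularity above can be illustrated with a minimal Python sketch. It is not the system described in this paper: the tiny English lexicon, the punctuation-based splitting and the function names (`score_subclause`, `classify_sentence`, `classify_text`) are all assumptions made for illustration, and simple score summation stands in for the actual linguistic rules.

```python
import re

# Hypothetical toy lexicon (assumption): the paper's dictionaries are not shown here.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "poor", "terrible"}
NEGATIONS = {"not", "no", "never"}


def split_sentences(text):
    """Naive sentence splitter on terminal punctuation."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def split_subclauses(sentence):
    """Naive shallow chunking: split on punctuation and the conjunction 'but'."""
    parts = re.split(r"[,;:]|\bbut\b", sentence)
    return [p.strip() for p in parts if p.strip()]


def score_subclause(clause):
    """Sum word polarities; a negation flips the next sentiment word."""
    score = 0
    negate = False
    for token in re.findall(r"\w+", clause.lower()):
        if token in NEGATIONS:
            negate = True
            continue
        polarity = (token in POSITIVE) - (token in NEGATIVE)
        if polarity:
            score += -polarity if negate else polarity
            negate = False
    return score


def label(score):
    """Map a numeric score to one of three classes."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify_sentence(sentence):
    """Sentence level: combine the scores of its subclauses."""
    return label(sum(score_subclause(c) for c in split_subclauses(sentence)))


def classify_text(text):
    """Text level: combine the subclause scores of all sentences."""
    total = sum(
        score_subclause(c)
        for s in split_sentences(text)
        for c in split_subclauses(s)
    )
    return label(total)


if __name__ == "__main__":
    # The subclause alone is unambiguous...
    print(label(score_subclause("The screen is not bad")))  # positive
    # ...but the sentence mixes two opposing subclauses.
    print(classify_sentence("The screen is not bad, but the battery is terrible"))  # neutral
    print(classify_text("The camera is great. The battery is poor. Support was terrible."))  # negative
```

Even in this toy setting the ordering of difficulty is visible: a single subclause is resolved by a local negation rule, whereas a sentence or a whole text requires a policy for combining conflicting subclause sentiments, and plain summation is only one of several possible choices.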