Sentiment Analysis of Spanish Tweets Using a Ranking Algorithm and Skipgrams *

Análisis de sentimientos sobre tweets en castellano utilizando un algoritmo de ranking y skipgrams

Javi Fernández, Yoan Gutiérrez, José M. Gómez, Patricio Martínez-Barco, Andrés Montoyo, Rafael Muñoz
Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante
{javifm,ygutierrez,jmgomez,patricio,montoyo,rafael}@dlsi.ua.es

Abstract: In this paper, we present our contribution to Task 1 (six-level polarity classification) of the TASS 2013 competition. This contribution consists of two different approaches: a modified version of a ranking algorithm (RA-SR) using bigrams, and a new proposal using a skipgram scorer. Both approaches create sentiment lexicons able to retain the context of the terms. All our approaches appear among the top 10 results of the systems submitted to the competition, and their combination reaches the first position.
Keywords: sentiment analysis, opinion mining, lexicon generation, machine learning, Twitter, ranking algorithm, skipgrams

* We would like to express our gratitude for the financial support given by the Department of Software and Computer Systems at the University of Alicante, the Spanish Ministry of Economy and Competitiveness (Spanish Government) through the project grants TEXT-MESS 2.0 (TIN2009-13391-C04-01), LEGOLANG (TIN2012-31224), ATTOS (TIN2012-38536-C03-03), SAM (FP7-611312), and the Valencian Government (grant no. PROMETEO/2009/119).

1 Introduction

Textual information has become one of the most important sources of data from which to extract useful and heterogeneous knowledge. Texts can provide factual information, such as descriptions, lists of features, or even instructions, and opinion-based information, which includes reviews, emotions, or feelings. This subjective information can be expressed through different textual genres, such as blogs, forums, and reviews, but also through social networks and microblogs.

Twitter is a microblogging social network that has gained much popularity in recent years. This service enables its users to send and read text-based messages of up to 140 characters, known as tweets. The site can be a vast source of subjective information in real time; millions of users share opinions on different aspects of their everyday life. Extracting this subjective information is of great value to both general and expert users. For example, users can find opinions about a product they are interested in, and companies and public figures can monitor their online reputation. Traditional Sentiment Analysis (SA) can deal with this task; however, it is difficult to apply it effectively, mainly because of the short length of the tweets, their informality, and the lack of context. SA systems must be adapted to face the challenges of this new textual genre.