Rivista Italiana di Economia Demografia e Statistica
Volume LXVIII n. 3/4 Luglio-Dicembre 2014

MACHINE LEARNING AND TEXT MINING TO CLASSIFY TWEETS ON A POLITICAL LEADER

Agostino Di Ciaccio, Giovanni M. Giorgi

1. Introduction

The social network Twitter was created in 2006, but in Italy it expanded slowly, starting from 2009. Twitter is now very popular, counting 255 million users, and has become the social medium most used by public figures, entertainers and politicians. On Twitter, each user manages a personal page that can be updated through text messages of at most 140 characters, known as “tweets”. The user can, however, add links to pictures, videos or other documents. The limit on the length of each tweet is both the strength and the weakness of this social network: you cannot develop an argument in 140 characters, but you can quickly write a sentence on a smartphone.

Let us recall some of the distinctive aspects of this social network. A Twitter user can choose to follow another user (becoming a “follower”), automatically receiving all of that user's messages. A message may be written independently, or may be a response to someone else's tweet (a “reply”). A “retweet” is a message written by another user that one fully agrees with and wants to promote in the community without altering it in any way. Hashtags are keywords that the user includes in tweets; a fake user is usually a humorous imitation of a celebrity; finally, an “influencer” is someone who has a large following (cf. Bentivegna, 2014).

A key feature of Twitter is that it is an open system, where everyone can read the tweets of other users and participate in a discussion. Many public figures, particularly politicians and entertainers, are on Twitter and anyone can write to them directly (although a response is unlikely).
Therefore, Twitter is an important showcase and an inexpensive way to communicate instantly with other users of the social network, bypassing the traditional media (TV, newspapers, radio). In the 2014 European elections, 92% of the Italian candidates had a Twitter account. In this paper we show how to analyze Twitter to measure the sentiment towards a political figure and to describe the community connected to him, even when this requires handling millions of tweets.