Towards bots detection by analyzing the behavior of user data on Twitter

Francisco Moo-Mena¹, Sofía Robles-Sandoval², Karina González-Magaña³, and Oliver Rodríguez-Adame⁴

¹ Facultad de Matemáticas, Universidad Autónoma de Yucatán, Mérida, Yuc., Mexico
² Universidad de Guadalajara, Guadalajara, Jal., Mexico
³ Instituto Tecnológico de Ciudad Guzmán, Ciudad Guzmán, Jal., Mexico
⁴ Instituto Tecnológico de Tepic, Tepic, Nay., Mexico

Abstract

Social networks currently play an important role as a means of communication on a wide range of topics. They are therefore a valuable source of data on the opinions of their users. However, the opinions expressed on these networks are exposed to the influence of specialized programs called bots, which are deployed to sway the discussion for or against a given point of view. Because social networks are implemented on platforms reachable from any device with Internet access, their content can be retrieved automatically through their APIs. Before analyzing the opinions expressed on a social network, it is highly advisable to apply a reliable bot detection mechanism as part of the data cleaning process. Although no optimized method for this task exists yet, this paper proposes a series of directives that can be considered for carrying it out. As a case study, these directives are applied to messages retrieved from Twitter that express opinions about the candidates in the 2018 Mexican presidential election.

Keywords: Bots detection, social network, Twitter, Presidential elections.

1. Introduction

Social networks have acquired great relevance in the dissemination of information and ideas, which has made them one more communication tool used by individuals and corporations. With the growth of information technologies and the rise of social networks, people spend more and more time on these platforms.
In July 2018 alone, the average number of daily tweets in Mexico was 92,006 [1]. Social networks are also an excellent way to gauge society's reactions to events of any kind. For this reason, and because they are available to most sectors of society, social networks have become easy targets for those who seek to manipulate or influence public opinion by introducing fictitious points of view and ideas that were not expressed by real people or institutions. This intrusion is often carried out with so-called bots, which are programs that publish on social networks in an automated way. To the best of our knowledge, there is still no optimized technique for automatic bot detection.

Twitter and Facebook are among the most popular social networks. Through well-defined APIs, it is possible to access user data on these networks. Given the enormous amount of data generated daily, it is worthwhile to retrieve and analyze it to learn the trends in the opinions of millions of users on specific topics. However, to obtain reliable information, data generated by bots must be detected and eliminated during the data cleaning stage.

In this work, after a review of the literature, a series of measures related to the identification of bots is described. Since the Twitter API is one of the most flexible to use, tweets from thousands of users are retrieved and the selected measures are calculated to try to determine which of the analyzed tweets were published by a bot and which were not. It should be noted that this first stage does not seek to determine whether the bot is malicious or not.
IJCSI International Journal of Computer Science Issues, Volume 16, Issue 1, January 2019 ISSN (Print): 1694-0814 | ISSN (Online): 1694-0784 www.IJCSI.org https://doi.org/10.5281/zenodo.2588241

That is, whether the bot's goal is to mislead or manipulate public opinion, since there are bots whose purpose is