Twitter: Who gets Caught? Observed Trends in Social Micro-blogging Spam

Abdullah Almaatouq
Center for Complex Engineering Systems at KACST and MIT
amaatouq@mit.edu

Ahmad Alabdulkareem
Center for Complex Engineering Systems at KACST and MIT
kareem@mit.edu

Mariam Nouh
Center for Complex Engineering Systems at KACST and MIT
mnouh@kacst.edu.sa

Erez Shmueli
Massachusetts Institute of Technology (MIT) Media Lab
shmueli@mit.edu

Mansour Alsaleh
King Abdulaziz City for Science and Technology
maalsaleh@kacst.edu.sa

Vivek K. Singh
Massachusetts Institute of Technology (MIT) Media Lab
singhv@mit.edu

ABSTRACT
Spam in Online Social Networks (OSNs) is a systemic problem that poses a threat to these services by undermining their value to advertisers and potential investors, as well as by negatively affecting users’ engagement. In this work, we present a unique analysis of spam accounts in OSNs viewed through the lens of their behavioral characteristics (i.e., profile properties and social interactions). Our analysis includes over 100 million tweets collected over the course of one month, generated by approximately 30 million distinct user accounts, of which over 7% were suspended or removed due to abusive behaviors and other violations. We show that there exist two behaviorally distinct categories of Twitter spammers and that they employ different spamming strategies. The users in these two categories demonstrate different individual properties as well as different social interaction patterns. As Twitter spammers continually create new accounts after being caught, an understanding of their spamming behavior will be vital in the design of future social media defense mechanisms.

Categories and Subject Descriptors
H.0 [Information systems]: General; K.4.2 [Social issues]: Abuse and crime involving computers; H.2.8 [Database Applications]: Data mining

Keywords
Spam, Online Social Networks, Microblogging, Account Abuse

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
XX’1, February 22, 2014, XX, XX.
Copyright 2013 ACM xxx-x-xxxx-xxxx-0/13/10 ...$15.00.
http://dx.doi.org/xxxx/xxxxx.xxxxxxx

1. INTRODUCTION
Spam exists across many types of electronic communication platforms, including email, web discussion forums, text messages (SMS), and social media. Today, as social media continues to grow in popularity, spammers are increasingly abusing such media for spamming purposes. According to a recent study [20], social spam grew by 355% during the first half of 2013. Twitter’s initial public offering (IPO) filing identifies spam as a major threat, both in undermining the company’s value to advertisers and potential investors and in negatively affecting users’ engagement [32].

While there is a growing literature on social media spam, both on developing tools for spam detection (e.g., [17, 24, 33]) and on analyzing spam trends (e.g., [27, 37, 38]), spammers continue to evolve and change their penetration techniques. Therefore, there is a continuous need to understand the evolving and diverse properties of malicious accounts in order to combat them properly [20, 32].

In this paper, we present an empirical analysis of spam accounts on Twitter, in terms of profile properties and social interactions. The analysis includes identifying categories (sub-populations) of spam accounts (see Section 4).
Through profile analysis, we identify distinct characteristics and patterns that pertain to the different identified categories of Twitter accounts (see Section 5). We also examine the network properties of several social interactions (namely, the follow relationship and mentions) to improve our understanding of the methods spammers use to reach their victims (see Section 6).

To perform the study, we collected over 100 million tweets over the course of one month (from March 5, 2013 to April 2, 2013), generated by approximately 30 million distinct user accounts (see Section 3). In total, over 7% of the accounts in our dataset were suspended or removed, due in part to abusive behaviors and other violations. The summary and future work of our study are discussed in Section 8.

In summary, we frame our contributions as follows:

• We categorize spam accounts based on their behavioral activities and find that Twitter spammers belong to two broad behavioral categories. We observe that these categories of spam accounts exhibit different spamming patterns and employ distinct strategies for reaching their victims, and should therefore be analyzed separately and treated differently by future social media defense mechanisms.