Understanding SMS Spam in a Large Cellular Network: Characteristics, Strategies and Defenses

Nan Jiang¹, Yu Jin², Ann Skudlark², and Zhi-Li Zhang¹

¹ University of Minnesota, Minneapolis, MN, {njiang,zhzhang}@cs.umn.edu
² AT&T Labs, Florham Park, NJ, {yjin,aes}@research.att.com

Abstract. In this paper, using a year (June 2011 to May 2012) of user-reported SMS spam messages together with SMS network records collected from a large US-based cellular carrier, we carry out a comprehensive study of SMS spamming. Our analysis shows various characteristics of SMS spamming activities, such as spamming rates, victim selection strategies and spatial clustering of spam numbers. Our analysis also reveals that spam numbers with similar content exhibit strong similarity in terms of their sending patterns, tenure, devices and geolocations. Using the insights we have learned from our analysis, we propose several novel spam defense solutions. For example, we devise a novel algorithm for detecting related spam numbers. The algorithm incorporates user spam reports and identifies additional (unreported) spam number candidates which exhibit similar sending patterns at the same network location as the reported spam number during a nearby time period. The algorithm yields a high accuracy of 99.4% on real network data. Moreover, 72% of these spam numbers are detected at least 10 hours before user reports.

1 Introduction

The past decade has witnessed an onslaught of unsolicited SMS (Short Message Service) spam [1] in cellular networks. The volume of SMS spam in the US rose 45% in 2011 to 4.5 billion messages and, in 2012, more than 69% of mobile users claimed to have received text spam [2]. In addition to creating an annoying user experience, SMS spam messages often entice users to visit certain (fraudulent) websites for other illicit activities, e.g., to steal personal information or to spread malware apps, which can inflict financial loss on the users.
At the same time, the huge volume of spam messages also concerns cellular carriers, because the messages traverse the network, causing congestion and hence degrading network performance.

Although akin to traditional email spam, SMS spam exhibits unique characteristics which render classical email spam filtering methods inapplicable. Unlike emails, which are generally stored on servers until users retrieve them, SMS messages are delivered instantly to the recipients through the Signaling System 7 (SS7) network, leaving little time for cellular carriers to react to spam. Meanwhile, high operational cost also limits the use of sophisticated spam filters which rely on inspecting SMS message content. Filtering SMS spam at end-user devices (e.g., using mobile apps) is also not a feasible solution, given that many SMS-capable devices (e.g., feature phones) do not support