Measuring Network-Aware Worm Spreading Ability

Zesheng Chen and Chuanyi Ji
School of Electrical & Computer Engineering
Georgia Institute of Technology, Atlanta, Georgia 30332
Email: {zchen, jic}@ece.gatech.edu

Abstract— This work investigates three aspects: (a) a network vulnerability, namely the non-uniform vulnerable-host distribution, (b) threats, i.e., intelligent worms that exploit such a vulnerability, and (c) defense, i.e., challenges for fighting the threats. We first study five data sets and observe consistently clustered vulnerable-host distributions. We then present a new metric, referred to as the non-uniformity factor, which quantifies the unevenness of a vulnerable-host distribution. This metric is essentially the Rényi information entropy and characterizes the non-uniformity of a distribution better than the Shannon entropy. We then analytically and empirically measure the infection rate and the propagation speed of network-aware worms. We show that, at the early stage of worm propagation, a representative network-aware worm can increase the spreading speed by a factor equal or nearly equal to the non-uniformity factor when compared to a random-scanning worm. This implies that when a worm exploits an uneven vulnerable-host distribution as a network-wide vulnerability, the Internet can be infected much more rapidly. Furthermore, we analyze the effectiveness of defense strategies on the spread of network-aware worms. Our results demonstrate that counteracting network-aware worms is a significant challenge for strategies that include host-based defense and IPv6.

I. INTRODUCTION

Worm scanning has become more and more sophisticated since the initial attacks of Internet worms. Most real worms, especially “old” ones such as Code Red [12], Slammer [13], and later Witty [19], use naive random scanning, which chooses target IP addresses uniformly and does not use any information on network vulnerabilities.
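As a rough illustration of why uniform random scanning ignores the vulnerable-host distribution, the sketch below computes the per-scan hit probability N/2^32 and the expected number of scans until the first hit. The function names, the four-group layout, and the host counts are hypothetical and chosen only for illustration; they do not come from the data sets studied in this paper.

```python
IPV4_SPACE = 2 ** 32  # number of addresses in the IPv4 space


def hit_probability(group_counts, space=IPV4_SPACE):
    """Probability that one uniformly chosen address is a vulnerable host.

    group_counts: vulnerable-host counts per address group (hypothetical data).
    Only the total matters, so clustering has no effect on this value.
    """
    total = sum(group_counts)
    if not 0 <= total <= space:
        raise ValueError("total vulnerable hosts must lie in [0, space]")
    return total / space


def expected_scans_to_first_hit(group_counts, space=IPV4_SPACE):
    """Mean of the geometric distribution with success probability N / space."""
    total = sum(group_counts)
    if total <= 0:
        raise ValueError("need at least one vulnerable host")
    return space / total


# Two hypothetical layouts with the same total N = 2**19 hosts in 4 groups.
even_layout = [2 ** 17] * 4
clustered_layout = [2 ** 19, 0, 0, 0]

print(hit_probability(even_layout))        # 2**-13, about 1.22e-4
print(hit_probability(clustered_layout))   # same value as the even layout
print(expected_scans_to_first_hit(clustered_layout))  # 8192.0
```

Both layouts give the same result because a uniform scanner sees only the total number of vulnerable hosts, not where they sit in the address space.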
Advanced scanning methods, however, have been developed that take the IP address structure into consideration. One example is routable scanning, which selects targets only in the routable address space, using the information provided by the BGP routing table [23], [26]. Another example is evasive worms, which exploit lightweight sampling to learn which subnets of the address space are live and spread only in those networks [16].

This work focuses on a class of network-aware worms. Such worms exploit information on the highly uneven distributions of vulnerable hosts. Vulnerable-host distributions have been observed to be bursty and spatially inhomogeneous by Barford et al. [1]. A non-uniform distribution of Witty-worm victims has been reported by Rajab et al. [15]. We have also found that a Web-server distribution is non-uniform in the IP address space [6]. These discoveries suggest that vulnerable hosts and Web servers may be “clustered” (i.e., non-uniform). The clustering/non-uniformity makes the network vulnerable, since if one host in a cluster is compromised, the rest may be compromised rather quickly.

In our prior work, we have studied a class of “worst-case” worms, called importance-scanning worms, which exploit non-uniform vulnerable-host distributions [6], [5]. Importance scanning is developed from and named after importance sampling in statistics. Importance scanning probes the Internet according to an underlying vulnerable-host distribution. Such a scanning method focuses worm scans on the most relevant parts of the address space and yields the optimal strategy¹. Importance scanning thus provides a “what-if” scenario: among the many ways in which intelligent worms could exploit such a vulnerability, importance scanning is a worst-case threat model. Hence, importance scanning can serve as a benchmark for studying real worms.

Are there any real network-aware worms? Code Red II and Nimda worms have used localized scanning [28], [29].
Localized scanning preferentially searches for vulnerable hosts in the “local” address space. The Blaster worm has used sequential scanning in addition to localized scanning [31]. Sequential scanning searches for vulnerable hosts through their closeness in the IP address space. It is not well understood, however, how to characterize the relationships between vulnerable-host distributions and these network-aware worms. What has been observed is that real network-aware and importance-scanning worms spread much faster than random-scanning worms [15], [6]. This shows the importance of the problem.

Does there exist a generic characteristic across different vulnerable-host distributions? If so, how do intelligent worms exploit such a vulnerability, and how can we defend against such worms?

Our goal is to investigate such a generic characteristic in vulnerable-host distributions, to quantify its relationship with network-aware worms, and to understand the effectiveness of defense strategies. In particular, we would like to answer the following questions:

• How can the non-uniformity of a vulnerable-host distribution be quantified by a simple metric?
• How can the spreading ability of network-aware worms be measured quantitatively?
• How do vulnerable-host distributions relate to network-aware worm spreading ability?
• What challenges do defense strategies face in slowing down the spread of a network-aware worm?

¹ Hitlist scanning [21] can be regarded as a special case of importance scanning when the complete information of vulnerable hosts is known.

To answer these questions, we first observe, from five measurement sets, common characteristics of non-uniform vulnerable-host distributions. We then derive a new metric, the non-uniformity factor, to characterize the non-uniformity of a vulnerable-host distribution. A larger non-uniformity factor