Topology-dependent Performance of Attack Graph Reconstruction in PPM-Based IP Traceback

Ankunda R. Kiremire, Matthias R. Brust, and Vir V. Phoha
Louisiana Tech University
Ruston, Louisiana, USA
{ark010, mbrust, phoha}@latech.edu

Abstract—A variety of schemes based on Probabilistic Packet Marking (PPM) have been proposed to identify the sources of Distributed Denial of Service (DDoS) attack traffic by IP traceback. These PPM-based schemes provide a way to reconstruct the attack graph (the network path taken by the attack traffic) and hence to identify its sources. Despite the large amount of research in this area, the influence of the underlying topology on the performance of PPM-based schemes remains an open issue. In this paper, we identify three network-dependent factors that affect different PPM-based schemes in different ways, giving rise to variation in, and discrepancies between, scheme performance from one network to another. Using simulation, we also show the collective effect of these factors on the performance of selected schemes in an extensive set of 60 Internet-like networks. We find that scheme performance depends on the network on which the scheme is implemented. We show how each of these factors contributes to a discrepancy in scheme performance in large-scale networks. This discrepancy appears regardless of similarities or differences in the underlying models of the networks.

I. INTRODUCTION

Internet Protocol (IP) traceback is a technique for identifying the sources of Distributed Denial of Service (DDoS) attacks from the attack traffic [1]. One approach to implementing IP traceback has routers embed their identity in packets randomly selected from all the packets they process [2]. In the event of an attack, the victim uses the packets that contain router identities to construct an attack graph. The attack graph is a representation of the routers and links that the attack packets traversed from the attacker(s) to the victim.
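The marking idea just described can be illustrated with a minimal Python sketch. It is not the full scheme of [2]. It assumes a single linear attack path, a mark that carries one router identity and may be overwritten by later routers, a marking probability p chosen arbitrarily for the example, and hypothetical router names R1 to R3. The victim simply counts how many packets it needs before it has seen every router on the path.

```python
import random


def forward_packet(path, p, rng):
    """Send one packet along `path` (attacker side first, victim side last).

    Each router independently overwrites the single mark field with its own
    identity with probability `p`. Returns the mark the victim sees, or None
    if no router marked the packet.
    """
    mark = None
    for router in path:
        if rng.random() < p:
            mark = router  # a router closer to the victim may overwrite this
    return mark


def packets_until_all_seen(path, p, rng, max_packets=1_000_000):
    """Count packets the victim receives before every router has been seen.

    Returns (packets_sent, set_of_seen_routers). The cap `max_packets` is an
    assumption of this sketch that guards against non-terminating runs.
    """
    seen = set()
    for sent in range(1, max_packets + 1):
        mark = forward_packet(path, p, rng)
        if mark is not None:
            seen.add(mark)
        if len(seen) == len(path):
            return sent, seen
    raise RuntimeError("not every router was observed within max_packets")


if __name__ == "__main__":
    rng = random.Random(0)            # fixed seed for repeatability
    path = ["R1", "R2", "R3"]         # hypothetical routers, attacker to victim
    sent, seen = packets_until_all_seen(path, p=0.2, rng=rng)
    print(f"saw {sorted(seen)} after {sent} packets")
```

In this sketch the routers farthest from the victim are observed least often, because every router downstream of them may overwrite their mark.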
This IP traceback type is called probabilistic packet marking (PPM) and is implemented by PPM-based schemes.

Considerable research has gone into designing PPM-based schemes that are computationally more efficient and more robust than the original PPM [1], [2]. However, little work has gone into identifying network-dependent factors that affect the performance of PPM-based schemes in large-scale networks. In fact, most simulations are carried out on disparate tree-structured topologies, and the analytical models derived from these topologies are used to predict the performance of the schemes when deployed in a large-scale network such as the Internet [3], [4], [5], [6]. Because the schemes are implemented on disparate networks, it is difficult to compare the performance of different schemes directly. Furthermore, because typical underlying topologies are tree-structured, it is difficult to make appropriate projections about scheme performance in a large-scale network without implementing the scheme on that network.

In this work, we show the influence of network topology on PPM-based scheme performance. We identify three network-dependent factors that affect scheme performance in large-scale networks: average shortest path length, overlapping of attack paths, and the occurrence of network motifs in attack graphs. Using specific attack graphs, we show the influence of each factor on selected PPM-based schemes. We then use 60 Internet-like networks to show how all the identified factors collectively contribute to the performance of PPM-based schemes in more realistic scenarios. The networks are selected to encompass the variety of mathematical models used by researchers to create networks that adequately describe the structure of the Internet [7].

Results show that PPM-based scheme performance depends on the network on which the scheme is implemented. In fact, even the order of performance changes from one network to another, i.e.
the best-performing scheme in one network is not necessarily the best-performing scheme in another network. Our results show how the identified factors contribute, both individually and collectively, to the performance of PPM-based schemes in large-scale networks.

II. RELATED WORK

A. PPM-based schemes for IP traceback

Introduced by Savage et al. [2], Probabilistic Packet Marking (PPM) consists of two algorithms: a marking algorithm and a reconstruction procedure. The marking algorithm is implemented at all routers in the network to ensure that randomly selected packets are embedded with the router's identity. Because of space limitations in the packet header, only one router or edge identity can be embedded in any of the selected packets. Additionally, to ease implementation, packet selection and marking at any router are independent of any other router and are done with a fixed probability.

Assuming a large amount of attack traffic is sent in a DDoS attack, the victim can expect to receive packets with different markings accounting for all the routers that the packets traversed from attackers to victim. The victim then employs the reconstruction procedure to build the attack graph using the marked packets. The total number of packets required to reconstruct the attack graph is referred to as the convergence