Cuckoo Sampling: Robust Collection of Flow Aggregates under a Fixed Memory Budget

Josep Sanjuàs-Cuxart, Pere Barlet-Ros (UPC BarcelonaTech), Nick Duffield (AT&T Labs–Research), Ramana Kompella (Purdue University)

Abstract—Collecting per-flow aggregates in high-speed links is challenging and usually requires traffic sampling to handle peak rates and extreme traffic mixes. Static selection of sampling rates is problematic, since worst-case resource usage is orders of magnitude higher than the average. To address this issue, adaptive schemes have been proposed in recent years that periodically adjust packet sampling rates to network conditions. However, such proposals rely on complex algorithms and data structures that are costly to maintain. As a consequence, adaptive sampling is still not widely implemented in routers. We present a novel flow-sampling-based measurement scheme called Cuckoo Sampling that efficiently collects per-flow aggregates, while smoothly discarding information as it exceeds the available memory. After a measurement epoch, it provides a random sample of the input flows, at a rate as close to the maximum as the available memory budget allows. Our proposal relies on a very simple data structure, requires few per-packet operations, has a CPU cost that is independent of the memory budget and traffic profile, and is suitable for hardware implementation. We back the theoretical analysis of the algorithm with experiments on both synthetic and real network traffic, and show that our algorithm requires significantly fewer resources than existing adaptive sampling schemes.

I. INTRODUCTION

As networks grow more complex and harder to manage, the deployment of devices that monitor network conditions has become a necessity. Network monitoring can aid in tasks such as fault diagnosis and troubleshooting, evaluation of network performance, capacity planning, traffic accounting and classification, anomaly detection, and the investigation of security incidents.
However, network traffic analysis is challenging in high-speed data links. In current backbone links, incoming packet rates leave very little time (e.g., 32 ns in OC-192 links in the worst case) to process each packet. Additionally, storing all traffic is infeasible; usually, operators only record traffic aggregates on a per-flow basis, as a means to obtain a significant reduction in data volume.

A paradigmatic example and, arguably, the most widespread flow-level measurement tool is NetFlow [1], which provides routers with the ability to export per-flow traffic aggregates. However, in today's networks, one can expect the number of active flows to be very large and highly volatile. Under anomalous conditions, including network attacks such as worm outbreaks, network scans, or even attacks that target the measurement infrastructure itself, the number of active flows can rise by orders of magnitude. Thus, not only must the router be able to process each packet very quickly, but it must also maintain a potentially enormous amount of state. As a consequence, provisioning monitors for worst-case scenarios is prohibitively expensive [2].

The most widely adopted approach both to prevent memory exhaustion and to reduce packet processing time is to sample the traffic under analysis. For example, Sampled NetFlow [1] is a standard mechanism that samples the incoming traffic on a per-packet basis. Sampled NetFlow requires the configuration of a fixed (static) sampling rate by the network operator. The main problem with such an approach is that operators tend to select "safe" parameters that ensure network devices will continue to operate under adverse traffic conditions. As a result, the sampling rates are set with the worst-case scenario in mind, which harms the completeness of the measurements under normal conditions.
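The 32 ns worst-case figure follows from simple arithmetic, which can be reproduced as follows (a minimal sketch; the approximation of the OC-192 rate as 10 Gbps and the 40-byte minimum packet size are our assumptions, not stated in the text):

```python
# Worst-case per-packet processing budget on an OC-192 link.
# Assumptions: link rate approximated as 10 Gbps, and a worst case
# of back-to-back minimum-size 40-byte packets.
LINK_RATE_BPS = 10e9     # OC-192 rate, approximated as 10 Gbps
MIN_PACKET_BYTES = 40    # minimum TCP/IP packet size

# Time to transmit one minimum-size packet, in nanoseconds.
budget_ns = MIN_PACKET_BYTES * 8 / LINK_RATE_BPS * 1e9
print(f"{budget_ns:.0f} ns per packet")  # -> 32 ns per packet
```

At this budget, every per-packet operation of a measurement algorithm must complete in a handful of memory accesses, which is why the complexity of the data structure matters so much.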
Several works have addressed the problem of dynamic packet sampling rate selection, which overcomes the drawbacks of static sampling rates by adapting to network conditions (e.g., [2]–[4]). Most notably, Adaptive NetFlow [3] maintains a table of active flows; when the table becomes full, the algorithm lowers the sampling rate and updates all table entries as though packets had been initially sampled at the resulting (lower) rate; flows for which the packet count becomes zero are discarded. However, adaptive sampling schemes, including Adaptive NetFlow, are still not widely used. For example, Cisco's NetFlow still relies on static sampling. We believe that the main reasons for this are that existing adaptive sampling schemes are too costly in terms of CPU requirements, and rely on complex data structures and algorithms, which makes them less attractive for implementation in networking hardware (we review the related work in Sec. II, while Sec. III presents Adaptive NetFlow in greater detail).

In this work, we turn our attention to flow-wise packet sampling [5] (also known as flow sampling), which allows us to find an elegant solution to the problem of adaptive sampling. We present a novel measurement scheme, which we have named Cuckoo Sampling (Sec. IV), that performs aggregate per-flow network measurements and, when the state required to track all incoming traffic exceeds a memory budget, maintains the largest possible random selection of the incoming flows, i.e., under overload, performs flow sampling at the appropriate rate. Our algorithm can cope with the extreme data rates of today's fast network links. The data structure is extremely efficient both when the traffic conforms to the available memory budget and under overload, when flow sampling is necessary.
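The essence of flow-wise packet sampling can be sketched as hashing each packet's flow key and keeping the packet only when the hash falls below a threshold derived from the target sampling rate, so that every packet of a given flow receives the same decision. The following is an illustrative sketch only; the concrete hash function, key layout, and threshold mapping are our assumptions, not the paper's algorithm:

```python
import hashlib

def flow_sampled(flow_key: tuple, rate: float) -> bool:
    """Decide, consistently for all packets of a flow, whether the
    flow is sampled. Hash-based sketch of flow-wise packet sampling;
    SHA-1 and the key encoding are illustrative choices.
    """
    digest = hashlib.sha1(repr(flow_key).encode()).digest()
    # Map the first 8 digest bytes to a value in [0, 1); the flow is
    # sampled if that value falls below the target sampling rate.
    h = int.from_bytes(digest[:8], "big") / 2**64
    return h < rate

# All packets of a flow hash to the same value, hence the same decision:
key = ("10.0.0.1", "10.0.0.2", 6, 1234, 80)
assert flow_sampled(key, 0.5) == flow_sampled(key, 0.5)
```

Because the decision depends only on the flow key, a sampled flow is observed in its entirety, which is what makes flow sampling attractive for per-flow aggregates compared with per-packet sampling.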