Identifying and Modeling Botnet C&C Behaviors

Sebastián García
Department of Computer Science, Czech Technical University in Prague
sebastian.garcia@agents.fel.cvut.cz

Vojtěch Uhlíř
Department of Computer Science, Czech Technical University in Prague
vojtech.uhlir@agents.fel.cvut.cz

Martin Rehak
Cisco Systems
Department of Computer Science, Czech Technical University in Prague
rehak@cisco.com

ABSTRACT
Through the analysis of a long-term botnet capture, we identified and modeled the behaviors of its C&C channels. The channels were found and characterized by means of periodicity analyses and statistical representations. The relationships found between the behaviors of the UDP, TCP and HTTP C&C channels allowed us to unify them in a general model of the botnet behavior. Our behavioral analysis of the C&C channels gives a new perspective on the modeling of malware behavior, helping to better understand botnets.

Categories and Subject Descriptors
H.4 [Intrusion/anomaly detection and malware mitigation]: Malware and its mitigation

General Terms
Experimentation, Security

Keywords
Malware, Botnet, Network Behavior, Network Security

1. INTRODUCTION
Botnets are still the most important source of attacks on the Internet. While there are many methods to detect them [4], there is still a need to improve their detection performance. Since most of these methods use a behavioral model of the botnet in their algorithms, and since most of them focus on the behavior of the Command and Control (C&C) channels, we believe that a better model of the behavior of these channels may help to create better detection algorithms.

We studied the traffic of a Zbot family botnet for 57 days to extract the characteristics of its C&C channels and then create state models of their behavior. We hypothesize that the general behaviors of a botnet can only be seen in a long capture. The comparison of these C&C models also allows us to create a model of the botnet itself.

Our contributions can be summarized as follows:

• A deep analysis of the behavioral characteristics that distinguish the C&C channels (Section 3).
• A state model of each C&C channel and of the botnet (Section 3).
• An analysis of the relationship between the channels and the botnet actions.
• A novel, labeled, 57-day botnet dataset (Section 4).

The network topology used by botnets has a strong relationship with multi-agent systems. Botnets typically consist of millions of interconnected computers that are controlled by one or more botmasters, who in turn follow specific goals. The botmasters send orders to each bot, and the bots react depending on the changing context. The individual bots usually do not make complex decisions by themselves but rely on the orders sent by the botmaster. From this point of view, botnets may possibly be the largest implementation of a real-world, distributed and efficient multi-agent system. Our work helps to understand the details of the network behavior of each agent and ultimately helps to improve the agent-based security research area.

We conclude that it is possible to extract the behavioral patterns of a botnet's C&C channels, to find hidden relationships between them, to correlate those patterns and to build state models of botnet behavior.

2. PREVIOUS WORK
The modeling of botnet behavior usually focuses on the C&C channels, since they are the most important characteristic of a botnet. If the channels can be modeled, the models may be used to better detect botnets. An analysis of the periodicity of C&C channels based on their Power Spectral Density was presented by Basil et al. [13]. The difference with our work is that they use a simulated and controlled botnet and that they aggregate the packet count of the traffic every 100 ms.

The behavior of the P2P protocol depends on the type of algorithm used and on the implementation of the botnet. The extensive analysis of the Nugache botnet done by Dittrich et al.
in [3] shows that its P2P traffic is encrypted and uses the TCP protocol and port 8. Although extensive, the analysis does not include a behavioral analysis of the C&C channel.

Another common approach to analyzing a C&C channel is to study its statistical features, as in the work presented by Kondo et al. [5]. Among the features analyzed are the packet sizes and the time intervals between packets. However, there is no information about which C&C it