10.1117/2.1200707.0783

A Markov game approach to cyber security

Dan Shen, Genshe Chen, Jose B. Cruz, Jr., Leonard S. Haynes, Martin Kruger, and Erik Blasch

High-level data fusion based on Markov game models can refine predictive models and capture features relevant to cyber network awareness.

Cyber attacks (CAs) have generally been one-dimensional, involving denial of service (DoS), computer viruses or worms, and unauthorized intrusion (hacking). Websites, mail servers, and client machines are the major targets. However, recent CAs have diversified to include multi-stage and multi-dimensional attacks with a variety of tools and technologies. Next-generation security will require network management and intrusion detection systems that combine short-term sensor information with long-term knowledge databases to provide decision support and cyberspace command and control.

Recent efforts to apply data fusion techniques to cyber situational awareness are promising [1, 2], but high-level data fusion, that is, assessing the potential impact of an attack and predicting intent, continues to present substantive challenges. We propose a new approach to evaluating network defenses in which each possible attack pattern is generated by a data-mining module and estimated by a game-theoretic data fusion module.

Our cyberspace security system has two fully interlocking parts, as indicated in Figure 1. The data fusion module permits refinement of primitive awareness and assessment into identification of new attacks, while the dynamic/adaptive feature recognition module generates estimates of these attacks and learns about them. The Markov game method, a stochastic approach, is used to evaluate the prospects of each potential attack. Game theory captures the nature of cyber conflict: determining the attacker's strategies is closely allied to decisions on defense and vice versa. Figure 1 also charts the data mining and fusion structure.
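The coupling between attacker strategy and defensive decisions described above can be illustrated with a deliberately small sketch. It is not the authors' model: it assumes a single-stage, zero-sum game restricted to pure strategies, and every action name and payoff number in it is hypothetical. Each attack is scored by the damage it still achieves against the defender's best reply, and each defense by the worst damage it allows.

```python
# Illustrative single-stage, zero-sum attacker/defender game.
# All action names and payoff numbers are hypothetical, not from the article.

# PAYOFF[attack][defense] = damage gained by the attacker (lost by the defender).
PAYOFF = {
    "dos":  {"filter_traffic": 1.0, "patch_server": 3.0, "monitor_only": 5.0},
    "worm": {"filter_traffic": 3.0, "patch_server": 1.0, "monitor_only": 6.0},
    "hack": {"filter_traffic": 4.0, "patch_server": 2.0, "monitor_only": 7.0},
}

ATTACKS = list(PAYOFF)
DEFENSES = list(next(iter(PAYOFF.values())))


def attack_prospect(attack):
    """Damage an attack still achieves against the defender's best reply."""
    return min(PAYOFF[attack][d] for d in DEFENSES)


def defense_exposure(defense):
    """Worst damage a defense allows if the attacker replies optimally."""
    return max(PAYOFF[a][defense] for a in ATTACKS)


def attacker_maximin():
    """Attack with the highest guaranteed damage, and that damage."""
    best = max(ATTACKS, key=attack_prospect)
    return best, attack_prospect(best)


def defender_minimax():
    """Defense with the lowest worst-case damage, and that damage."""
    best = min(DEFENSES, key=defense_exposure)
    return best, defense_exposure(best)


if __name__ == "__main__":
    for a in ATTACKS:
        print(f"{a}: guaranteed damage {attack_prospect(a)}")
    print("attacker maximin:", attacker_maximin())
    print("defender minimax:", defender_minimax())
```

With these made-up numbers the attacker can guarantee a damage of 2.0, while the defender can cap the damage at 3.0. The gap between the two values means that neither side has a stable pure strategy, so each side's best choice depends on what the other does. A Markov game would repeat this kind of evaluation at every network state and add probabilistic transitions between states.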
For instance, detection of new attack patterns is linked to Level One results in dynamic learning, including deception reasoning, trend/variation identification, and multi-agent learning. Our approach to deception detection is heavily rooted in the application of pattern-recognition techniques to locate and diagnose anomalous conditions in the cyber environment. Dynamic learning and refinement can also enhance Level Two and Level Three data fusion.

Figure 1. A data-mining/data-fusion approach for cyber situational awareness and impact assessment.

To address network security from a system control and decision perspective, we present a Markov game model in line with the standard definition [3]. Cyber attackers, defense-system users, and normal network users are the players (decision makers). All possible states of the involved network nodes constitute the state space. For example, one node state is that the web server is controlled by attackers. To determine the optimal deployment of the intrusion detection system, we also include the defensive status of each network node in the state space. In addition, at every time step, each player chooses targets with associated actions based on local network information. Finally, the transition rule calculates a probability distribution over the state space for the next time step.

Our simulation of a network scenario with 269 computers, 10 routers, and 18 switches (see Figure 2) demonstrates that we can detect and defend against two-stage cyber attacks in which a target computer (web server) is first infected or hacked and then used to