An Optimal Control Approach to Malware Filtering

Michael Bloem 1,2
NASA Ames Research Center
Moffett Field, CA 94035-1000, USA
michael.j.bloem@nasa.gov

Tansu Alpcan 1
Deutsche Telekom Laboratories
Technische Universität Berlin
Ernst-Reuter-Platz 7, 10587, Germany
tansu.alpcan@telekom.de

Tamer Başar 2
Coordinated Science Lab
University of Illinois
1308 West Main Street
Urbana, IL 61801, USA
tbasar@control.csl.uiuc.edu

Abstract: We study and develop an optimal control theoretic approach to malware filtering in the context of network security. We investigate the malware filtering problem by capturing the tradeoff between increased security on one hand and continued usability of the network on the other. We analyze the problem using a linear control system model with a quadratic cost structure and develop algorithms based on H∞-optimal control theory. A dynamic feedback filter is derived and shown, via numerical analysis, to improve on various heuristic approaches to malware filtering. The results obtained are verified and demonstrated with packet-level simulations on the Ns-2 network simulator.

I. INTRODUCTION

The cost of malicious software and attacks on computer networks is well documented, and these costs only grow as corporations and organizations become more dependent upon networked systems and as attacks increase in sophistication. As networks become more complex, preventing and responding to malicious attacks also become more costly. Estimates put the cost of the Code Red virus attack of 2001 at $740 million in cleanup, monitoring, and system checking, and $450 million in lost productivity [1]. Aside from their daunting magnitude, these two types of costs are interesting in the sense that they can be traded off against each other. Attacks on computer networks, such as worm or denial-of-service attacks, are expensive in part due to the challenge of preventing them while allowing legitimate network usage.
The base-rate fallacy captures the essence of this problem. Even if we have low false-negative and false-positive rates in our detection of attacks, there is so much more legitimate network usage than illegitimate usage that we end up with many false alarms [2]. Intrusion detection systems must be constructed with this dilemma in mind, and thus need to be conservative in their operation.

In this paper, we use an optimal control approach to investigate how to dynamically choose an optimal security level, one that is high enough to adequately prevent costly attacks but not so high as to excessively prevent legitimate network usage. Specifically, we apply methods based on optimal control theory to address the question of how to set up dynamic network traffic filters to prevent attacks or slow the spread of malicious software, or malware, within a single network by protecting sub-networks. Our aim is to develop algorithms and policies for configurable firewalls [3] in order to filter malware traffic such as worms, viruses, spam, and trojans.

1 Research supported in part by Deutsche Telekom AG.
2 Research supported in part by a grant from the Boeing Company, through the Information Trust Institute at the University of Illinois at Urbana-Champaign.

We use H∞-optimal control to determine how to dynamically change filtering rules in order to ensure a certain performance level. We note that in H∞-optimal control, by viewing the disturbance as an intelligent maximizing opponent in a dynamic zero-sum game, who plays with knowledge of the minimizer's control action, one evaluates the system under the worst possible conditions. This approach applies naturally to the problem of malware response because the traffic deviation resulting from a malware attack is not merely random noise, but represents the efforts of an intelligent attacker. Therefore, we determine the control action that will minimize costs under these worst-case circumstances [4].
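As an illustration of the worst-case reasoning described above, the following minimal Python sketch solves a static toy version of the zero-sum game. It is not the dynamic H∞ controller studied in this paper: the finite sets of filtering levels and attack intensities, the function names, and the cost function `toy_cost` are all assumptions chosen only for illustration.

```python
def worst_case_cost(u, disturbances, cost):
    """Cost of action u when the attacker, knowing u, picks the worst disturbance."""
    return max(cost(u, w) for w in disturbances)


def minimax_action(actions, disturbances, cost):
    """Action that minimizes the worst-case cost over all disturbances."""
    return min(actions, key=lambda u: worst_case_cost(u, disturbances, cost))


def toy_cost(u, w):
    """Hypothetical quadratic cost (an assumption, not the paper's cost).

    u: filtering level, w: malware traffic intensity.
    Unfiltered malware (w - u, floored at zero) is penalized quadratically,
    and filtering effort is penalized to reflect lost legitimate usage.
    """
    residual = max(w - u, 0.0)
    return residual ** 2 + 0.5 * u ** 2


# Hypothetical finite grids of filtering levels and attack intensities.
actions = [0.0, 0.5, 1.0, 1.5, 2.0]
disturbances = [0.0, 1.0, 2.0]

best_action = minimax_action(actions, disturbances, toy_cost)
best_worst_cost = worst_case_cost(best_action, disturbances, toy_cost)
print(best_action, best_worst_cost)
```

With these hypothetical numbers, the minimizer selects a filtering level of 1.5 with a worst-case cost of 1.375. Not filtering at all (level 0.0) would be cheapest in the absence of an attack, but it incurs a cost of 4.0 against the strongest attack, which is why the worst-case criterion favors a moderate amount of filtering.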
We study the algorithms developed via simulations in MATLAB and the Ns-2 network simulator, and verify the optimality of our solution in various scenarios. To the best of the authors' knowledge, this work represents the first application of H∞-optimal control theory to the problem of malware filtering.

A. Related Work

There are several methods of dynamic packet filtering [5]. Perhaps the most common one is to dynamically change which ports are open or closed. Stateful inspection of deeper layers of packets allows for even more detailed filtering by creating and maintaining information about the state of a current connection [3]. Another possibility is to dynamically alter the set of IP addresses from which traffic will be accepted [6].

Implicit in the network traffic filtering problem considered in this paper is the partitioning of a computer network into various sub-networks for administrative and security purposes. This approach is common, and a separate firewall is often assigned to each sub-network. Zou et al. have proposed a "Firewall Network System" based on this very concept in [7]. Cisco recommends their IOS firewalls for defending particular sub-networks or LANs in a corporate network [3]. In [8], quarantining these sub-networks is considered as a strategy to slow the spread of worm epidemics. We note that although the algorithms developed in this paper can be helpful for configuring dynamic firewalls such as the