Proxi-Annotated Control Flow Graphs: Deterministic Context-Sensitive Monitoring for Intrusion Detection

Samik Basu 1 and Prem Uppuluri 2

1 Dept. of Computer Science, Iowa State University, Ames, IA 50011-1040
sbasu@cs.iastate.edu
2 Dept. of Computer Science and Electrical Engineering, University of Missouri, Kansas City, MO 64110
uppulurip@umkc.edu

Abstract. Model- or specification-based intrusion detection systems have been effective in detecting known and unknown host-based attacks with few false alarms [12, 15]. In this approach, a model of program behavior is developed either manually, using a high-level specification language, or automatically, by static or dynamic analysis of the program. The actual program execution is then monitored against the modeled behavior; deviations from the modeled behavior are flagged as attacks. In this paper we discuss a novel model generated using static analysis of executables (binary code). Our key contribution is a model that is both precise and runtime efficient. Specifically, we extend the efficient control flow graph (CFG) based program behavioral model with context-sensitive information, thus providing the precision afforded by the more expensive push-down systems (PDS). Executables are instrumented with operations on auxiliary variables, referred to as proxi variables. These annotated variables allow the resulting context-sensitive control flow graphs, obtained by statically analyzing the executables, to be deterministic at runtime. We prove that the resultant model, called the proxi-annotated control flow graph, is as precise as previous approaches that use context-sensitive push-down models and, in fact, enhances the runtime efficiency of such models. We show that our technique is flexible enough to handle different variations of recursion in a program efficiently. This yields better monitoring of programs whose recursion depth is not predetermined.
1 Introduction

Intrusion detection systems (IDS) have shown promise in detecting a large number of host-based attacks [4, 12, 15]. They can be categorized into (a) misuse-based systems [4, 13], which detect previously known attacks by monitoring the system behavior; (b) anomaly-based systems [9, 2, 11], in which machine learning or expert systems learn a system's behavior and attacks are detected as deviations of actual program behavior from the learnt behavior; and (c) specification/model-based systems [7, 12], in which the intended program behavior is modeled and attacks are detected as deviations from this behavior. Of these approaches, misuse-based approaches cannot detect unknown