Improving Intrusion Detection Performance Using Keyword Selection and Neural Networks

Richard P. Lippmann
MIT Lincoln Laboratory, Rm S4-121
244 Wood Street
Lexington, MA 02173-0073
rpl@sst.ll.mit.edu
phone: (781) 981-2711

Robert K. Cunningham
MIT Lincoln Laboratory, Rm S4-129
244 Wood Street
Lexington, MA 02173-0073
rkc@sst.ll.mit.edu

Abstract

The most common computer intrusion detection systems detect signatures of known attacks by searching for attack-specific keywords in network traffic. Many of these systems suffer from high false-alarm rates (often hundreds of false alarms per day) and poor detection of new attacks. Performance can be improved using a combination of discriminative training and generic keywords. Generic keywords are selected to detect attack preparations, the actual break-in, and actions after the break-in. Discriminative training weights keyword counts to discriminate between the few attack sessions where keywords are known to occur and the many normal sessions where keywords may occur in other contexts. This approach was used to improve the baseline keyword intrusion detection system used to detect user-to-root attacks in the 1998 DARPA Intrusion Detection Evaluation. It reduced the false-alarm rate by two orders of magnitude (to roughly one false alarm per day) and increased the detection rate to roughly 80%. The improved keyword system detects new as well as old attacks in this database and has roughly the same computation requirements as the original baseline system. Both generic keywords and discriminative training were required to obtain this large performance improvement.

1. Introduction

Heavy reliance on the internet and worldwide connectivity has greatly increased the potential damage that can be inflicted by remote attacks launched over the internet.
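The combination described in the abstract can be illustrated with a minimal sketch: count occurrences of generic keywords in a session transcript, then learn a per-keyword weighting that separates attack sessions from normal ones. The keyword list, session texts, and the use of logistic regression here are illustrative assumptions, not the paper's actual keyword set or training procedure.

```python
import math

# Hypothetical generic keywords spanning attack preparation, break-in,
# and post-break-in actions (illustrative only, not the paper's list).
KEYWORDS = ["passwd", "uudecode", "chmod", "/etc/shadow", "xterm -display"]

def keyword_counts(session_text):
    """Count occurrences of each generic keyword in a session transcript."""
    text = session_text.lower()
    return [text.count(k) for k in KEYWORDS]

def train_weights(count_vectors, labels, epochs=200, lr=0.1):
    """Fit per-keyword weights with simple logistic regression so that
    keywords occurring mainly in attack sessions (label 1) get large
    weights, while keywords common in normal traffic are down-weighted."""
    w = [0.0] * len(KEYWORDS)
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(count_vectors, labels):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))   # predicted attack probability
            err = y - p                       # gradient of the log-loss
            b += lr * err
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
    return w, b

def score(session_text, w, b):
    """Score a session: weighted sum of keyword counts through a sigmoid."""
    x = keyword_counts(session_text)
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))
```

Training on even a toy pair of sessions (one attack transcript containing the keywords, one normal transcript without them) drives the attack score above the normal score, which is the essence of weighting counts discriminatively rather than alarming on any keyword match.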
It is difficult to prevent such attacks by security policies, firewalls, or other mechanisms because system and application software always contains unknown weaknesses or bugs, and because complex, often unforeseen, interactions between software components and/or network protocols are continually exploited by attackers. Intrusion detection systems are designed to detect attacks that inevitably occur despite security precautions. A review of the many alternative approaches to intrusion detection is available in [1].

The most common approach to intrusion detection, often called "signature verification," detects previously seen, known attacks by looking for an invariant signature left by these attacks. This signature may be found either in host-based audit records on a victim machine or in the stream of network packets sent to and from a victim and captured by a "sniffer," which stores all important packets for on-line or future examination. The Network Security Monitor (NSM) was an early signature-based intrusion detection system that found attacks by searching for keywords in network traffic captured using a sniffer. Early versions of the NSM [2] were the foundation for many government and commercial intrusion detection systems including NetRanger [3] and NID [4]. This type of system is popular because one sniffer can