Eigen Co-occurrence Matrix Method for Masquerade Detection

Mizuki Oka 1, Yoshihiro Oyama 2,3, Kazuhiko Kato 3,4

1 Master's Program in Science and Engineering, University of Tsukuba
2 Department of Computer Science, Graduate School of Information Science and Technology, University of Tokyo
3 CREST, Japan Science and Technology Agency
4 Institute of Information Sciences and Electronics, University of Tsukuba

1 Introduction

Masquerading can be a serious threat to the security of computer systems. Masqueraders are people who attempt to impersonate a legitimate user. Masqueraders can be detected by anomaly-based intrusion detection.

In anomaly-based intrusion detection, a user profile that represents a user's typical behavior is usually built. If the input data deviates greatly from the profile, the input data is classified as intrusive; otherwise it is classified as normal. Examples of typical data used are: time of login, physical location of login, duration of user session, programs executed, names of files accessed, and user commands issued [7].

There are two steps in deciding whether input data is normal or intrusive. The first step is the extraction of features from the data. The second step is the classification of the extracted features.

The conventional approaches to the first step, feature extraction, include the use of histograms (frequency of appearance of events) [16] and n-grams (sequences of n consecutive events) [6]. These techniques are often used in the field of text classification. The sequences of events used in masquerade detection are composed of events that do not follow well-defined syntactic rules. Instead, rules are embedded as a kind of causal relationship between two events in a sequence. These rules can be extracted by means of feature extraction. Feature extraction by histogram does not extract the sequential rules of the sequences. The n-gram is another kind of feature.
It deals with the distribution of runs of n neighboring events. Neither of them considers the causal relationship (a kind of rule) between events, whether those events are adjacent or non-adjacent in a sequence.

As for the second step, various classification approaches from the field of pattern classification have been applied to anomaly-based intrusion detection, such as rule-based methods [4] [5] [12], automata [10] [14] [1], Bayesian networks [2], Naive Bayes [8], neural networks [3], Markov chains [11], and hidden Markov chains [15].

We focus on the first step, feature extraction, and propose a new method called the Eigen Co-occurrence Matrix (ECM) method to extract the causal relationships embedded in sequences of events. To extract such features, we use an event co-occurrence matrix. A co-occurrence is the relationship between every pair of events within an interval of a data sequence, which neither histograms nor n-grams take into consideration. We then perform Principal Component Analysis (PCA) on a set of event co-occurrence matrices to create a set of orthogonal axes. Each interval of input data is represented as a vector in the new orthogonal vector space. This process reduces the large co-occurrence matrix to a compact representation and makes it possible to obtain the essential features embedded in the matrix. Extracting features in vector form makes it possible to use the various classification methods developed in pattern classification.

One of the benefits of ECM is its adaptation to conceptual drift. Conceptual drift is a change in the behavior of a user or a system over time. An effective intrusion detection system needs to adapt to this change while not adapting to intrusive behavior.

On the one hand, the conventional approaches address the problem of conceptual drift by updating a classifier. Data classified as normal by the classifier is added to the profile. This is a kind of feedback mechanism.
On the other hand, ECM adds a pre-phase that updates the training data set. The training data set is used to create the set of orthogonal axes onto which input data is projected to extract a feature vector. The training data can contain any data, regardless of whether the data is normal or intrusive.

The remainder of this paper is organized as follows: Section 2 gives an overview of the ECM method. Section ?? describes the ECM method formally using an example data set. Section 3 applies the ECM method to masquerade detection to evaluate the method. Section 4 provides results from experiments. Section 5 contains a summary and areas of future work.

2 Overview of ECM method

The ECM method is a novel feature extraction method. ECM represents the features embedded in an event sequence in terms of the causal relationship between events. ECM measures the strength of the causal relationship between two events in terms of the distance between them and their frequency of appearance in the event sequence. The closer two events are, the stronger their causal relationship becomes. Similarly, the more frequently an event pair appears, the stronger their causal relationship becomes.

ECM represents the causal relationship by converting a window of an event sequence into an event co-occurrence matrix. An element of the co-occurrence matrix represents the occurrence of an event pair in the event sequence within a scope size. The scope size defines the extent to which the causal relationship of an event pair should be considered.
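
The conversion of a window into a co-occurrence matrix can be illustrated with a minimal sketch. It rests on assumptions that this overview does not state: each ordered pair (earlier event, later event) whose positions are at most `scope` apart adds 1 to the corresponding matrix element, and no distance weighting is applied. The function name `cooccurrence_matrix`, its parameters, and the sample commands are hypothetical and chosen only for illustration.

```python
def cooccurrence_matrix(window, events, scope):
    """Count ordered event pairs that occur within `scope` positions.

    Assumed counting rule (not taken from the paper): for each position i,
    every later event at position j with j - i <= scope adds 1 to the
    element at row window[i], column window[j].
    """
    index = {event: k for k, event in enumerate(events)}
    size = len(events)
    matrix = [[0] * size for _ in range(size)]
    for i, first in enumerate(window):
        # Look ahead at most `scope` positions, staying inside the window.
        for j in range(i + 1, min(i + scope + 1, len(window))):
            second = window[j]
            matrix[index[first]][index[second]] += 1
    return matrix


# Hypothetical window of user commands; rows and columns follow `events`.
events = ["cd", "ls", "less"]
window = ["cd", "ls", "less", "ls", "less"]
for row in cooccurrence_matrix(window, events, scope=2):
    print(row)
# [0, 1, 1]
# [0, 1, 2]
# [0, 1, 1]
```

In this example the pair (ls, less) receives the largest count because it appears twice within the scope, which reflects the frequency aspect of the relationship described above. A larger `scope` would let more distant pairs contribute to the matrix.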