International Journal of Computer Applications (0975 – 8887) Volume 50 – No.6, July 2012

Exploratory Data Model for Effective WLAN Anomaly Detection based on Feature Construction and Reduction

Ajay M. Patel
Assistant Professor, Acharya Motibhai Patel Institute of Computer Studies, Ganpat University, Ganpat Vidyanagar-384012, India

A. R. Patel
Director, Department of Computer Application & Information Technology, H. North Gujarat University, Patan - 384265, India

Hiral R. Patel
Assistant Professor, Department of Computer Science, Ganpat University, Ganpat Vidyanagar-384012, India

ABSTRACT
An efficient and effective anomaly detection system requires behavior analysis of each activity. Unsupervised techniques are used for this purpose, but the accuracy and reliability of their results depend on the data set used for modeling. It is therefore essential to identify important input features, missing values, and redundancy, and to explore the features before modeling. Different statistical analytical methods are used for this data preprocessing. In this paper, a statistical feature construction scheme based on factor analysis is proposed. The proposed feature construction model provides a way to remove redundancy and to identify missing values and collinearity in the initial data set. Experimental results show that related good features are factorized using statistical measures, which should improve the results of unsupervised algorithms for effective anomaly detection.

General Terms
Dimension Reduction, Normality, Eigen Value, Linearity

Keywords
Anomaly Detection, Factor Analysis, Feature Construction, Intrusion, Linearity, Reduction

1. INTRODUCTION
The growing interconnection of computer systems that depend on reliable network communication is becoming a major challenge. Network-based systems have increasingly become targets for attackers, and many of these attacks have led to information and financial losses.
Intrusion Detection Systems (IDS) have therefore been proposed to combat these threats by complementing prevention-based security measures. An intrusion is any exploitation of a system's security policy, and intrusion detection is a mechanism developed to detect such exploitation. It rests on the assumption that intrusive activities differ conspicuously from the normal activities of the system and are therefore detectable. Intrusion detection is intended to complement existing security measures by detecting actions that bypass the system's security policy monitoring and control.¹ In general, Intrusion Detection Systems are designed to protect the availability, confidentiality, and integrity of critical networked information systems. To achieve a high level of intrusion detection, the framework should be based on a feature space that provides a good characterization of anomalous activity.²

¹ (Matrix Factorization Approach for Feature Deduction and Design of Intrusion Detection Systems)

Intrusion detection systems are generally classified into misuse detection, also known as the signature-based approach, and anomaly detection, also known as the profile-based approach. The signature-based approach detects intrusions by searching network audit data for matches with the signatures of known attacks. The profile-based approach detects intrusions by searching network audit data for deviations from the established profiles of normal behavior of users and systems.

This paper focuses on anomaly detection. In anomaly detection techniques, a profile of normal behavior is usually established first. The observed behavior of the subject is then compared with its normal profile, and an intrusion is signaled when the observed behavior deviates significantly from that profile. The primary advantage of anomaly-based detection is the ability to detect novel attacks for which signatures have not been defined.
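The compare-against-a-profile idea described above can be illustrated with a minimal sketch. It is not the scheme proposed in this paper (which is based on factor analysis); it substitutes a simple per-feature z-score test. The feature names, the sample values, and the threshold of 3 standard deviations are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def build_profile(normal_samples):
    """Return one (mean, standard deviation) pair per feature,
    estimated from samples assumed to contain only normal behavior."""
    columns = list(zip(*normal_samples))
    return [(mean(col), stdev(col)) for col in columns]


def is_anomalous(profile, observation, threshold=3.0):
    """Flag an observation when any feature lies more than `threshold`
    standard deviations away from its profile mean."""
    for (mu, sigma), value in zip(profile, observation):
        if sigma == 0:
            # A constant feature: any different value counts as a deviation.
            if value != mu:
                return True
            continue
        if abs(value - mu) / sigma > threshold:
            return True
    return False


# Hypothetical features: (frames per second, mean frame size in bytes).
normal = [(100, 512), (110, 500), (95, 520), (105, 508), (98, 515)]
profile = build_profile(normal)

print(is_anomalous(profile, (102, 510)))  # close to the profile
print(is_anomalous(profile, (900, 510)))  # far above the normal frame rate
```

A per-feature test like this ignores correlations between features, which is one reason a feature construction step can be useful before the profile is built.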
Profiles of normal behavior can be built with a variety of techniques, including statistical methods and data mining algorithms. These algorithms require a set of purely normal data on which to train their model; if the training data contain traces of intrusions, the algorithm may not detect future instances of these attacks because it will presume that they are normal. In most circumstances, such purely normal data can be extremely difficult or impossible to obtain. An intrusion detection system therefore requires efficient unsupervised algorithms for intrusion behavior analysis, and for better efficiency these algorithms require normalized data.³

² (Feature Construction Scheme for Efficient Intrusion Detection System, 2010)
³ (Factor-analysis based anomaly detection and clustering, Elsevier)

The main purpose of this paper is to identify important input features for building an IDS that is computationally efficient, and to support the effective development of classification techniques for unsupervised anomaly detection. To identify important input features, a statistical feature construction scheme has been developed that uses factor analysis, one of the most popular statistical techniques. Factor analysis is a statistical technique used to identify a relatively small number of factors that can represent the relationships among sets of many interrelated variables. It reduces the attribute space from a larger number of variables to a