www.astesj.com

Risk Management: The Case of Intrusion Detection using Data Mining Techniques

Ruba Obiedat *

King Abdullah the Second School of Information Technology, The University of Jordan, 11942, Jordan

ARTICLE INFO

Article history:
Received: 02 May, 2020
Accepted: 14 June, 2020
Online: 25 June, 2020

Keywords:
Risk Management
Intrusion Detection
Data Mining
Machine Learning

ABSTRACT

Nowadays, every institution relies on its online systems and frameworks to do business. Such procedures need more attention because of the massive number of attacks that occur. These procedures have to go first through the management team of the institution in order to prevent attackers' exploits. In this way, risk management can readily identify and control the risks that occur. One of these risks is intrusion, an act in which an attacker invades someone’s privacy to steal or damage their information. Various techniques have been proposed in the literature to prevent such actions. This research proposes an intrusion detection model that uses data mining techniques to distinguish the most recent attacks. Three machine learning classification models, namely J48, Random Forest, and REPTree, are applied to improve the detection rate. Furthermore, a feature selection method is applied to improve the effectiveness of the classifiers, to overcome high dimensionality, which is one of the main technical problems facing intrusion detection systems, and to identify the most important intrusion features affecting the system. These features can be very useful in protecting systems from attackers. The results identify the top 11 effective features. The best results were achieved by J48, with an accuracy rate of 76.271%.

1. Introduction

In recent years, people all around the world have become more dependent on information technology of all kinds, notably with the expansion of the internet and of automated businesses and procedures in different fields [1]. People use computers and applications to acquire information about many things, such as stock prices, news, and online trade. Others store patients’ medical records, credit card details, and other personal data on their systems, either offline or online. Many organizations, for example, have a web presence as a fundamental part of their business. Controlling and securing these critical assets and information depends on the decisions of the management team [2, 3]. Consequently, without a good plan and scheme, risks can occur regularly. This kind of risk can severely harm a company, for example, which makes controlling intrusions and detecting each attack that happens especially important [2]. The availability and integrity of the systems must be ensured against various threats, such as hacking or damage, which can eventually hurt the image of the institution. Hence, secure information and communication have become indispensable. Furthermore, the need to detect privacy breaches and to meet information security demands makes robust intrusion detection and prevention systems (IDPSs) more necessary [2-5].

Several techniques have been proposed to prevent such vulnerabilities. The most recent and effective one is “Data Mining”, which is a method of understanding and finding patterns in data [6, 7]. This data is usually collected from different sources, each of which portrays a case that occurs in a different scenario. In real-life terms, data can be pictured as stones and sand, and mining this gravel extracts the jewels (useful information). Therefore, extracting such beneficial information can help us prevent attacks and information leaks more effectively.
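As a rough illustration of what finding a pattern in collected data can look like for intrusion detection, the sketch below learns a single-feature rule (the simple OneR baseline) from a few labelled connection records. It is not one of the classifiers applied in this research. The records, the field names (protocol, failed_logins, label), and the helper one_rule are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-made connection records (not taken from any real dataset).
RECORDS = [
    {"protocol": "tcp", "failed_logins": "many", "label": "attack"},
    {"protocol": "tcp", "failed_logins": "none", "label": "normal"},
    {"protocol": "udp", "failed_logins": "none", "label": "normal"},
    {"protocol": "tcp", "failed_logins": "many", "label": "attack"},
    {"protocol": "udp", "failed_logins": "many", "label": "attack"},
    {"protocol": "tcp", "failed_logins": "none", "label": "normal"},
]


def one_rule(records, features, target="label"):
    """Return (feature, rule, errors) for the single feature whose
    value -> majority-label rule misclassifies the fewest records."""
    best = None
    for feature in features:
        # Count how often each label occurs for each value of this feature.
        counts = defaultdict(Counter)
        for row in records:
            counts[row[feature]][row[target]] += 1
        # Predict the majority label for every observed value.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Count the records that this rule gets wrong.
        errors = sum(rule[row[feature]] != row[target] for row in records)
        if best is None or errors < best[2]:
            best = (feature, rule, errors)
    return best


feature, rule, errors = one_rule(RECORDS, ["protocol", "failed_logins"])
print(feature, rule, errors)
```

On these toy records, the failed_logins field separates attacks from normal traffic without error, while protocol does not. Real traffic is rarely this clean, which is why stronger classifiers and a careful choice of features are needed in practice.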
* Ruba Obiedat, Email: r.obiedat@ju.edu.jo
ASTESJ ISSN: 2415-6698
Advances in Science, Technology and Engineering Systems Journal Vol. 5, No. 3, 529-535 (2020)
https://dx.doi.org/10.25046/aj050365

In our case, a dataset of these logs is collected, in which each attack or non-attack instance is represented as a row, while the columns are the characteristics of each instance. These characteristics are called features. Another critical process that can help us understand the patterns and hidden information is called Feature Selection. It is a way to remove redundant and unimportant features from the dataset and keep the most important ones without affecting the essential information in the data. It relies on identifying the features which are independent of each other but