IOSR Journal of Computer Engineering (IOSR-JCE)
e-ISSN: 2278-0661, p-ISSN: 2278-8727, Volume 21, Issue 3, Ser. I (May - June 2019), PP 18-27
www.iosrjournals.org
DOI: 10.9790/0661-2103011827

Performance Evaluation of Machine Learning Algorithms for Detection and Prevention of Malware Attacks

Emmanuel Gbenga Dada¹, Joseph Stephen Bassi¹, Yakubu Joseph Hurcha² and Abdulkadir Hamidu Alkali¹
¹ Department of Computer Engineering, Faculty of Engineering, University of Maiduguri, Maiduguri - Borno State, Nigeria.
² Department of Mathematical Sciences, Faculty of Science, University of Maiduguri, Borno State, Nigeria.
Corresponding Author: Emmanuel Gbenga Dada

Abstract: Malware is any type of program that is intended to wreak havoc on a computer system or network. Examples of malware include bots, ransomware, adware, keyloggers, viruses, trojan horses and worms. The exponential growth of malware poses a great danger to the security of confidential information. The problem with many existing classification algorithms is their low performance in terms of their ability to detect malware and prevent it from infecting the computer system. There is an urgent need to evaluate the performance of the existing Machine Learning classification algorithms used for malware detection. Such an evaluation will help in creating more robust and efficient algorithms that can overcome the weaknesses of the existing ones. This study evaluated the performance of the following classification algorithms: J48, LMT, Naïve Bayes, Random Forest, MLP Classifier, Random Tree, REP Tree, Bagging, AdaBoost, KStar, SimpleLogistic, IBK, LWL, SVM, and RBF Network. The algorithms were evaluated in terms of Accuracy, Precision, Recall, Kappa Statistic, F-Measure, Matthews Correlation Coefficient, Receiver Operating Characteristic (ROC) Area and Root Mean Squared Error, using the WEKA machine learning and data mining simulation tool.
Our experimental results showed that the Random Forest algorithm produced the best accuracy, 99.2%. This indicates that the Random Forest algorithm achieves good accuracy in detecting malware.

Keywords: Malware, classification algorithms, Random Forest, AdaBoost, Bagging, Naïve Bayes

Date of Submission: 26-04-2019    Date of acceptance: 11-05-2019

I. Introduction

Breakthroughs in internet technology and computer networking have made high-speed shared internet possible. One effect of this development is the daily increase in the number of computer systems that are susceptible to malware attacks [1, 2]. These innovations have made the internet a huge storehouse where resources are virtualized and utilised according to the needs of users. Despite the immense benefits that the internet revolution has brought, it also poses numerous challenges to the security of computer systems. The conventional computer system is centered entirely on a single host machine running an operating system, while several machines connected to the host run guest operating systems [1]. The most prevalent security threat confronting users is an attack on a computer system by malicious programs, which then spread to other computers that have not yet been infected [3]. The threat posed by malware infections has become a major challenge in the field of computer security over the years. The number of new malware samples on the internet keeps increasing at an alarming rate, even as anti-virus companies work to curtail the trend and keep the vast number of computer users safe. Malware has evolved over time and is becoming more sophisticated than before.
It is now more difficult to detect. There is therefore a need to invent more efficient techniques that can detect and prevent these attacks. Malware is a malicious program which infringes on the security of a computer system in terms of the privacy, reliability, and accessibility of data [3]. This trend has led academics and industry practitioners to move from the conventional static detection techniques [4, 5] to more dynamic, sophisticated and spontaneous methods that apply accumulated malware behaviour to detect malware attacks [6, 7, 8]. Malware can simply be defined as a malicious program that users unsuspectingly install on their machines; such a program may later disrupt the proper operation of the machine, or may remain unnoticed while it carries out malicious actions [9]. Once the attacker gains control of the machine, they can access any information stored on it. Some of the deceptive approaches used to install malware on a computer system through the internet include software repackaging, update attacks [10] and drive-by downloads [11]. The attacker employs any of the methods mentioned before to create