Multi-Perspective Machine Learning: A Classifier Ensemble Method for Intrusion Detection

Sean T Miller
The University of the West Indies Mona
Department of Computing
Kingston, Jamaica
sean.miller@mymona.uwi.edu

Curtis Busby-Earle
The University of the West Indies Mona
Department of Computing
Kingston, Jamaica
curtis.busbyearle@uwimona.edu.jm

ABSTRACT

Today cyber security is one of the most active fields of research due to its wide-ranging impact on business, government and everyday life. In recent years machine learning methods and algorithms have been quite successful in a number of security areas. In this paper we explore an approach to classifying intrusions called multi-perspective machine learning (MPML). For any given cyber-attack there are multiple methods of detection. Every method of detection is built on one or more network characteristics. These characteristics are in turn represented by a number of network features. The main idea behind MPML is that grouping features that support the same characteristic into feature subsets, called perspectives, encourages diversity among the perspectives (classifiers in the ensemble) and improves the accuracy of prediction. Initial results on the NSL-KDD dataset show at least a 4% improvement over other ensemble methods such as bagging, boosting, rotation forest and random forest.

CCS Concepts

•Machine learning → Learning paradigms; Supervised learning; •Security and privacy → Intrusion detection systems; •Machine learning algorithms → Ensemble methods;

Keywords

Machine learning; ensemble methods; cybersecurity; intrusion detection

1. INTRODUCTION

Machine learning (ML) and cyber security are two of the most active areas of research. Many studies and systems based on machine learning have been proposed for detecting cyber threats [1, 8, 10].
As the world becomes more dependent on internet services and information systems, the impact of cybercrimes will continue to grow, as will the relevance of further research in this area. Machine learning methods have had success against different cyber threats, such as intrusion detection [7, 10], botnets [4, 6] and misuse detection [1]. This paper explores a machine learning ensemble method, multi-perspective machine learning (MPML).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
WOODSTOCK ’97 El Paso, Texas USA
© 2016 ACM. ISBN 123-4567-24-567/08/06.
DOI: 10.1145/1235

MPML is an ensemble method built on the multi-view learning concept. The main aim of MPML is to improve the accuracy of malware detection through the use of carefully selected malware characteristics. These characteristics are represented by different subsets of features. We call these feature subsets perspectives. Finally, the perspectives are used to train classifiers whose results are combined to give a final prediction.

This paper is arranged as follows: Section 1 introduces the basic idea behind MPML. Section 2 explores some related work and methods similar to MPML, highlighting the similarities and differences. Section 3 defines the multi-perspective approach. Section 4 presents the details of the dataset used in the experiments. Section 5 describes the experiments and their results.
Section 6 concludes the paper with a discussion of the results and possible future work.

2. RELATED WORK

2.1 Multi-View Learning

Multi-view learning is a rapidly growing area of machine learning with strong theoretical foundations and promising practical success [15]. It is concerned with machine learning tasks in which the data are represented by multiple distinct feature sets, or views. Multi-view learning can be applied to different types of learning methods, including supervised learning, semi-supervised learning, ensemble learning and active learning. MPML applies the multi-view learning concept to ensemble learning by creating distinct feature subsets based on characteristics seen in the data set.

2.2 Ensemble methods

Ensemble methods combine individually trained classifiers whose predictions are evaluated to provide a single output [3]. Some ensemble methods include:

2.2.1 Bagging

Bagging is a bootstrap ensemble method that creates individual classifiers for its ensemble by training each classifier on a random redistribution of the training set [12]. In these