Pattern Recognition 89 (2019) 161–171

Integration of deep feature extraction and ensemble learning for outlier detection

Debasrita Chakraborty (a), Vaasudev Narayanan (b), Ashish Ghosh (a)
(a) Machine Intelligence Unit, Indian Statistical Institute, 203 B. T. Road, Kolkata, 700108, India
(b) Department of Computer Science and Engineering, Indian Institute of Technology, Dhanbad, 826004, India

Article history: Received 27 June 2018; Revised 7 December 2018; Accepted 2 January 2019; Available online 3 January 2019

Keywords: Deep learning; Autoencoders; Probabilistic neural networks; Ensemble learning; Outlier detection

Abstract

Most datasets do not contain an equal number of samples for each class. In some tasks, however, such as the detection of fraudulent transactions, the class imbalance is overwhelming and one of the classes has very few samples (even less than 10% of the entire data). Such tasks often fall under outlier detection. Moreover, in some scenarios there may be multiple subsets of the outlier class; such cases should be treated as multiple-outlier-type detection. In this article, we propose a system that can efficiently handle all of the aforementioned problems. We use stacked autoencoders to extract features and then an ensemble of probabilistic neural networks to perform a majority vote and detect the outliers. Such a system shows better and more reliable performance than other outlier detection systems on most of the datasets tested. The use of autoencoders is seen to clearly enhance outlier detection performance. © 2019 Elsevier Ltd. All rights reserved.

1. Introduction

Outliers play an important role in defining the nature of a dataset.
They are interesting points in the data that do not conform to the expected or natural behaviour of the dataset. They appear as anomalies, exceptions, discordant observations, surprises, peculiarities, aberrations, or contaminants in different application domains. An outlier is an observation that is highly unlikely under a model assumed to generate the data [1]. In most practical cases the model is abstract, as in finding a fraudulent credit card transaction among millions of genuine transactions. The data may also contain multiple types of outliers, such as different kinds of intrusions in a network. One might consider the intrusions a single outlier class and approach the problem in a binary or multi-class fashion. However, the practicality of such an approach is questionable, as intrusions are highly diverse and may appear in the data for different reasons. Many such cases make single-type and multiple-type outlier detection a crucial part of the data analysis process. Figs. 1 and 2 show the two approaches: in the former, the diversity of the outlier types is not considered, and in the latter it is.

Corresponding author. E-mail address: ash@isical.ac.in (A. Ghosh).

There is, however, a bottleneck at which most algorithms get stuck. Finding an outlier in any dataset is extremely rare; in most cases the number of samples from the outlier class is below 10% of the total number of samples in the entire training set. Identifying them is therefore one of the difficult problems in data analysis [2]. Sampling techniques are usually not preferred for outlier detection, since oversampling a minority (outlier) class or undersampling a majority (inlier) class usually harms generalisation capability [3]. Moreover, the extreme imbalance (below 10%) is a major challenge for many algorithms, because the presence of outliers often misleads algorithms for clustering, classification, or regression. There may also be cases where the outliers arise for different reasons and are diverse among themselves. When a dataset contains multiple types of outliers, each with a different property, treating all the outlier classes as a single class may not make sense. This differs from a multi-class imbalance problem in that the outliers constitute less than 10% of the entire dataset.

This article proposes and investigates a new supervised outlier detection framework inspired by the projection methodology of deep learning. It is shown analytically that the proposed method alleviates the aforesaid drawbacks of the existing standard approaches (unsupervised and semi-supervised outlier detection methodologies are not considered, in order to give a fair comparison with the proposed method). We have performed multiple experiments on several datasets to test whether the non-linear

https://doi.org/10.1016/j.patcog.2019.01.002
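The overall pipeline summarised above (learned low-dimensional features followed by a majority-voting ensemble of probabilistic neural networks) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy imbalanced dataset, the single-layer linear autoencoder (standing in for the paper's stacked autoencoders), the Parzen-window form of the PNN with uniform class priors, and all hyper-parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy imbalanced dataset: 180 inliers vs. 20 outliers (10% of the data).
# Both the data and every hyper-parameter below are illustrative only.
inliers = rng.normal(0.0, 1.0, size=(180, 4))
outliers = rng.normal(4.0, 0.5, size=(20, 4))
X = np.vstack([inliers, outliers])
y = np.array([0] * 180 + [1] * 20)   # 1 marks the outlier class

def train_autoencoder(X, hidden=2, epochs=300, lr=0.01):
    """Single-hidden-layer linear autoencoder with tied weights,
    trained by full-batch gradient descent -- a simple stand-in for
    the stacked autoencoders used in the paper."""
    n, d = X.shape
    W = rng.normal(0.0, 0.1, size=(d, hidden))
    for _ in range(epochs):
        H = X @ W                  # encode
        err = H @ W.T - X          # decode (tied weights) minus input
        # gradient of 0.5*||X W W^T - X||^2 w.r.t. the tied weight W
        gW = (X.T @ (err @ W) + err.T @ H) / n
        W -= lr * gW
    return W

W = train_autoencoder(X)
Z = X @ W                          # extracted low-dimensional features

def pnn_predict(Ztr, ytr, Zte, sigma):
    """Parzen-window classifier: the pattern/summation layers of a
    probabilistic neural network with Gaussian kernels and uniform
    class priors (an assumption of this sketch)."""
    scores = []
    for c in (0, 1):
        D = ((Zte[:, None, :] - Ztr[ytr == c][None, :, :]) ** 2).sum(-1)
        scores.append(np.exp(-D / (2.0 * sigma ** 2)).mean(axis=1))
    return (scores[1] > scores[0]).astype(int)

# Ensemble of PNNs with different smoothing widths; majority vote.
votes = np.stack([pnn_predict(Z, y, Z, s) for s in (0.1, 0.3, 0.5)])
pred = (votes.sum(axis=0) >= 2).astype(int)
print("detected outliers:", int(pred.sum()), "of", int(y.sum()))
```

On this well-separated toy data the autoencoder features keep the outlier cluster far from the inliers, so each PNN in the ensemble flags it easily; the interesting cases in the paper are, of course, those where the separation is learned rather than given.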