Applied Soft Computing 24 (2014) 977–984

journal homepage: www.elsevier.com/locate/asoc

A comparative study of classifier ensembles for bankruptcy prediction

Chih-Fong Tsai a,1, Yu-Feng Hsu b,2, David C. Yen c,∗

a Department of Information Management, National Central University, Jhongli, Taiwan, ROC
b Department of Information Management, National Sun Yat-Sen University, Kaohsiung, Taiwan, ROC
c School of Economics and Business, 226 Netzer Administration Building, SUNY College at Oneonta, Oneonta, NY 13820, United States

Article info

Article history:
Received 20 March 2013
Received in revised form 8 April 2014
Accepted 22 August 2014
Available online 6 September 2014

Keywords:
Bankruptcy prediction
Credit scoring
Classifier ensembles
Data mining
Machine learning

Abstract

The aim of bankruptcy prediction in the areas of data mining and machine learning is to develop an effective model that provides high prediction accuracy. In the prior literature, various classification techniques have been developed and studied; among them, classifier ensembles, which combine multiple classifiers, have been shown to outperform many single classifiers. However, three critical issues can affect the performance of classifier ensembles. The first is the classification technique adopted; the other two are the method used to combine the multiple classifiers and the number of classifiers to be combined.
Since few relevant studies have examined these issues, this paper conducts a comprehensive comparison of classifier ensembles built from three widely used classification techniques, namely multilayer perceptron (MLP) neural networks, support vector machines (SVM), and decision trees (DT), using two well-known combination methods, bagging and boosting, and different numbers of combined classifiers. Our experimental results on three public datasets show that DT ensembles composed of 80–100 classifiers using the boosting method perform best. The Wilcoxon signed-rank test also shows that DT ensembles by boosting perform significantly differently from the other classifier ensembles. Moreover, a further study of a real-world case using a Taiwan bankruptcy dataset also demonstrates the superiority of DT ensembles by boosting over the others.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Developing an effective bankruptcy prediction model is a very important but rather difficult task for financial institutions. The aim of bankruptcy prediction models is to predict whether or not a new applicant (an individual or a company) will go bankrupt. If a prediction model does not perform well (i.e., it has a high prediction error rate), it will lead to incorrect decisions and is therefore very likely to cause great financial crises and distress [29]. Similar to the objective of bankruptcy prediction, credit scoring (or rating) focuses on determining whether loan customers belong to a good or a bad applicant group. In other words, an effective credit scoring model can also help financial institutions decide whether or not to grant credit to new applicants [10]. In particular, both bankruptcy prediction and credit scoring are regarded as financial decision-making problems as well as binary classification problems.
That is, the model is designed to assign new observations to one of two pre-defined classes, the ‘good’ and ‘bad’ risk classes [26]. If a credit scoring model classifies a new observation into the ‘bad’ risk class, this is similar to a bankruptcy prediction model forecasting that the new observation will go bankrupt. In other words, a ‘bad’ risk case can simply be regarded as the same as a ‘bankruptcy’ case.

∗ Corresponding author. Tel.: +1 607 436 3458; fax: +1 607 436 2543.
E-mail addresses: cftsai@mgt.ncu.edu.tw (C.-F. Tsai), d974020002@student.nsysu.edu.tw (Y.-F. Hsu), David.Yen@oneonta.edu (D.C. Yen).
1 Tel.: +886 3 422 7151; fax: +886 3 4254604.
2 Tel.: +886 7 525 2000; fax: +886 7 5254799.

Related studies have shown that machine learning techniques, such as neural networks, outperform conventional statistical techniques, including logistic regression, in terms of prediction accuracy and error [27,29]. Specifically, combinations of multiple classification techniques, i.e., classifier ensembles, perform far better than single classification techniques [17]. Generally speaking, classifier ensembles are based on training a fixed number of classifiers for the same domain problem (or training set), and the final output for a given unknown data sample is obtained by combining the outputs of the trained classifiers. In the literature, bagging and boosting are the two widely used combination methods [17] (cf. Section 3.2). Although many related studies have demonstrated the superiority of classifier ensembles over many single classifiers, most of them constructed only one specific type of classifier ensemble for bankruptcy prediction, such as neural network ensembles [13,29,31,33] or decision tree ensembles [1,26,32,35]. In addition, most of these classifier ensembles are based on only one specific combination method, i.e., either bagging or boosting (cf. Section 3.3).
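As a rough illustration of the two combination methods named above, the following Python sketch trains a fixed number of weak classifiers and combines their outputs, once by bagging (bootstrap resampling with an unweighted majority vote) and once by boosting (here the standard AdaBoost reweighting scheme). It is not the experimental setup of this study: one-level decision stumps stand in for full decision trees, the ten-point dataset is invented for illustration, and all function names (fit_stump, bagging, boosting, accuracy) are hypothetical.

```python
import math
import random

def fit_stump(X, y, w):
    """Weighted decision stump: returns (error, feature, threshold, sign)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                # Stump predicts `sign` when row[j] <= t, otherwise `-sign`.
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (sign if row[j] <= t else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best

def stump_predict(stump, row):
    _, j, t, sign = stump
    return sign if row[j] <= t else -sign

def bagging(X, y, n_estimators, seed=0):
    """Fit each stump on a bootstrap resample; combine by unweighted majority vote."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample n rows with replacement
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], [1.0 / n] * n))
    return lambda row: 1 if sum(stump_predict(s, row) for s in stumps) >= 0 else -1

def boosting(X, y, n_estimators):
    """AdaBoost: upweight misclassified samples each round; combine by weighted vote."""
    n = len(X)
    w = [1.0 / n] * n
    rounds = []
    for _ in range(n_estimators):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)     # vote weight of this stump
        rounds.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]                # renormalise to sum to 1
    return lambda row: 1 if sum(a * stump_predict(s, row) for a, s in rounds) >= 0 else -1

def accuracy(model, X, y):
    return sum(model(row) == yi for row, yi in zip(X, y)) / len(y)

# Toy data only (not from the paper): +1 = 'good' risk, -1 = 'bad' risk.
X = [[float(i)] for i in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

for k in (1, 3):
    print(k, accuracy(bagging(X, y, k), X, y), accuracy(boosting(X, y, k), X, y))
```

The n_estimators argument corresponds to the number of combined classifiers, the third design factor the study varies; the accuracies printed here are training accuracies on the toy data and say nothing about the paper's results.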
http://dx.doi.org/10.1016/j.asoc.2014.08.047
1568-4946/© 2014 Elsevier B.V. All rights reserved.

Although some previous works have compared the bagging and boosting methods [5,19] and found that boosting outperforms bagging, they conclude that the performance of classifier ensembles built by bagging and boosting is usually domain dependent. Therefore, in the domain problems of bankruptcy prediction and credit scoring assessment there is no comparative study to assess the performances of a good