Applied Soft Computing 24 (2014) 977–984
Contents lists available at ScienceDirect
Applied Soft Computing
journal homepage: www.elsevier.com/locate/asoc

A comparative study of classifier ensembles for bankruptcy prediction

Chih-Fong Tsai a,1, Yu-Feng Hsu b,2, David C. Yen c,*

a Department of Information Management, National Central University, Jhongli, Taiwan, ROC
b Department of Information Management, National Sun Yat-Sen University, Kaohsiung, Taiwan, ROC
c School of Economics and Business, 226 Netzer Administration Building, SUNY College at Oneonta, Oneonta, NY 13820, United States

Article info
Article history: Received 20 March 2013; Received in revised form 8 April 2014; Accepted 22 August 2014; Available online 6 September 2014
Keywords: Bankruptcy prediction; Credit scoring; Classifier ensembles; Data mining; Machine learning

Abstract
The aim of bankruptcy prediction in the areas of data mining and machine learning is to develop an effective model that provides high prediction accuracy. In the prior literature, various classification techniques have been developed and studied, among which classifier ensembles, which combine multiple classifiers, have been shown to outperform many single classifiers. However, in constructing classifier ensembles, there are three critical issues that can affect their performance. The first is the classification technique adopted, and the other two are the combination method used to combine the multiple classifiers and the number of classifiers to be combined, respectively.
Since few relevant studies have examined these issues, this paper conducts a comprehensive comparison of classifier ensembles built with three widely used classification techniques, namely multilayer perceptron (MLP) neural networks, support vector machines (SVM), and decision trees (DT), using two well-known combination methods, bagging and boosting, and different numbers of combined classifiers. Our experimental results on three public datasets show that DT ensembles composed of 80–100 classifiers using the boosting method perform best. The Wilcoxon signed-rank test also demonstrates that DT ensembles by boosting perform significantly differently from the other classifier ensembles. Moreover, a further study of a real-world case using a Taiwan bankruptcy dataset was conducted, which also demonstrates the superiority of DT ensembles by boosting over the others.
© 2014 Elsevier B.V. All rights reserved.

1. Introduction

Developing an effective bankruptcy prediction model is a very important but rather difficult task for financial institutions. The aim of bankruptcy prediction models is to predict whether or not a new applicant (an individual or a company) will go bankrupt. If a prediction model does not perform well (i.e. it yields a high prediction error rate), it will lead to incorrect decisions and is hence very likely to cause great financial crises and distress [29]. Similar to the objective of bankruptcy prediction, credit scoring (or rating) focuses on determining whether loan customers belong to a good or a bad applicant group. In other words, an effective credit scoring model can also help financial institutions decide whether or not to grant credit to new applicants [10]. In particular, both bankruptcy prediction and credit scoring are regarded as financial decision-making problems as well as binary classification problems.
That is, the model is designed to assign new observations to one of two pre-defined classes, the ‘good’ and ‘bad’ risk classes [26]. If a credit scoring model classifies a new observation into the ‘bad’ risk class, this is similar to a bankruptcy prediction model forecasting that the new observation will go bankrupt. In other words, a ‘bad’ risk case can simply be regarded as the same as a ‘bankruptcy’ case.

* Corresponding author. Tel.: +1 607 436 3458; fax: +1 607 436 2543.
E-mail addresses: cftsai@mgt.ncu.edu.tw (C.-F. Tsai), d974020002@student.nsysu.edu.tw (Y.-F. Hsu), David.Yen@oneonta.edu (D.C. Yen).
1 Tel.: +886 3 422 7151; fax: +886 3 4254604.
2 Tel.: +886 7 525 2000; fax: +886 7 5254799.

Related literature and studies have shown that machine learning techniques, such as neural networks, outperform conventional statistical techniques, including logistic regression, in terms of prediction accuracy and error [27,29]. Specifically, classifier ensembles, which combine multiple classification techniques, perform far better than single classification techniques [17]. Generally speaking, classifier ensembles are based on training a fixed number of classifiers for the same domain problem (or the same training set), and the final output for a given unknown data sample is obtained by combining the outputs of the trained classifiers. In the literature, bagging and boosting are the two widely used combination methods [17] (cf. Section 3.2).
Although many related studies have demonstrated the superiority of classifier ensembles over many single classifiers, most of them constructed only one specific type of classifier ensemble for bankruptcy prediction, such as neural network ensembles [13,29,31,33] or decision tree ensembles [1,26,32,35]. In addition, most of these classifier ensembles are based on only one specific combination method, i.e. either bagging or boosting (cf. Section 3.3).
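As a rough illustration of the ensemble idea described above (train a fixed number of classifiers, then combine their outputs), the following Python sketch builds a bagging ensemble combined by majority vote. It is not the experimental setup of this paper: the one-feature decision stump, the toy data, and all function names are assumptions made for illustration, and boosting is not shown.

```python
import random
from collections import Counter

# Hypothetical toy data: (financial ratio, label), where 1 = 'bad' risk
# (bankrupt) and 0 = 'good' risk. These values are invented for illustration.
TOY_DATA = [(0.1, 1), (0.2, 1), (0.3, 1), (0.4, 1),
            (0.6, 0), (0.7, 0), (0.8, 0), (0.9, 0)]


def train_stump(samples):
    """Fit a one-feature threshold classifier (a decision stump)."""
    best = None
    for threshold in sorted({x for x, _ in samples}):
        for above_label in (0, 1):
            # Predict `above_label` when x >= threshold, else the other class.
            errors = sum(
                (above_label if x >= threshold else 1 - above_label) != y
                for x, y in samples
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, above_label)
    _, threshold, above_label = best
    return lambda x: above_label if x >= threshold else 1 - above_label


def bagging_ensemble(samples, n_classifiers, seed=0):
    """Train each member on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_classifiers):
        # Bootstrap: draw len(samples) items with replacement.
        bootstrap = [rng.choice(samples) for _ in samples]
        members.append(train_stump(bootstrap))
    return members


def predict(members, x):
    """Combine the members' outputs by majority vote."""
    votes = Counter(member(x) for member in members)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    ensemble = bagging_ensemble(TOY_DATA, n_classifiers=25)
    for ratio in (0.05, 0.95):
        print(ratio, "->", predict(ensemble, ratio))
```

The number of members (`n_classifiers`) is the third design choice named in the abstract; here it is an arbitrary value chosen only to keep the example small.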
http://dx.doi.org/10.1016/j.asoc.2014.08.047
1568-4946/© 2014 Elsevier B.V. All rights reserved.

Although some previous works have compared the bagging and boosting methods [5,19] and found that boosting outperforms bagging, they conclude that the performances of classifier ensembles by bagging and boosting are usually domain dependent. However, in the domain problems of bankruptcy prediction and credit scoring assessment, there is no comparative study to assess the performances of a good