Variable Selection Method Affects SVM Approach in Bankruptcy Prediction

Chih-Hung Wu 1, Wen-Chang Fang 2, Yeong-Jia Goo 3
1 Department of Business Administration, Takming College, Taipei, Taiwan
2,3 Department of Business Administration, National Taipei University, Taipei, Taiwan

ABSTRACT
This paper examines how variable selection influences the bankruptcy predictive accuracy of the following models: discriminant analysis, logistic regression, probit regression, neural networks, support vector machine (SVM), and genetic-based SVM (GA-SVM). Empirical results indicate that the SVM-based models are very promising for predicting financial failure, in terms of both predictive accuracy and generalization ability. In addition, variable selection had the least influence on predictive accuracy in the GA-SVM model with optimal parameter values.

Keywords: Variable Selection, Bankruptcy Prediction, Support Vector Machine.

1. Introduction
Predicting corporate failure has been an important research topic in accounting and finance for the last three decades (Lee, Han, and Kwon, 1996). Previous applications of neural networks in finance and accounting, notably in bankruptcy prediction, have been limited to back-propagation neural networks (Yang, Platt, and Platt, 1999). Recently, a new machine learning algorithm, the support vector machine (SVM), was developed by Boser, Guyon, and Vapnik (1992) to provide better decision boundaries than could be obtained using traditional neural networks. Since the model was proposed (Boser, Guyon, and Vapnik, 1992; Cortes and Vapnik, 1995), SVMs have been successfully applied to numerous problems, including handwriting recognition, particle identification (e.g. muons), digital image identification, text categorization, bioinformatics, function approximation and regression, and database marketing.
SVMs have also become more widely used for time series forecasting and the dynamic reconstruction of chaotic systems. However, few articles have studied how variable selection influences SVM-based models in financial prediction problems. Consequently, this study analyzes how variable selection influences the bankruptcy predictive accuracy of the following models: discriminant analysis, logistic regression, probit regression, neural networks, support vector machines (SVM), and genetic-based SVM (GA-SVM).

2. Overview of methodologies for predicting bankruptcy
The corporate distress literature includes several diverse methodologies for discriminating between failed and non-failed firms, following Beaver’s univariate comparison of financial ratios in 1966. Extensive studies in this area have applied statistical and AI approaches over the last three decades. The well-known multivariate models used in this area include multiple discriminant analysis (MDA) (Altman, 1968; Altman, Haldeman, and Narayanan, 1977), regression modeling (Edmister, 1972), logit analysis (Ohlson, 1980; Platt and Platt, 1990), and probit analysis (Zmijewski, 1984). More recently, AI approaches such as neural networks have shown promise as classification tools (Odom and Sharda, 1990; Berry and Treigueiros, 1991; Coakley and Brown, 1991; Raghupathi, Schkade, and Raju, 1991; Lee, Han, and Kwon, 1996; Yang, Platt, and Platt, 1999). In addition to the methodologies above, this study extends SVM to the financial distress classification problem.

3. Research Design
3.1 Research Data
Financial-statement data of the failed and non-failed firms were obtained from the database of the Taiwan Economic Journal (TEJ), covering in cases of three years prior to failure