Bulletin of Electrical Engineering and Informatics
Vol. 14, No. 4, August 2025, pp. 3029~3036
ISSN: 2302-9285, DOI: 10.11591/eei.v14i4.9556
Journal homepage: http://beei.org

Feature selection for support vector machines in imbalanced data

Borislava Toleva 1, Ivan Ivanov 1, Vincent Hooper 2
1 Faculty of Economics and Business Administration, Sofia University St. Kl. Ohridski, Sofia, Bulgaria
2 SP Jain Global School of Management, Academic City, Dubai, United Arab Emirates

Article Info
Article history: Received Nov 14, 2024; Revised May 31, 2025; Accepted Jul 5, 2025

ABSTRACT
Addressing the effects of class imbalance on feature selection models has become an increasingly important focus in academic research. This study introduces a novel support vector machine (SVM)-based algorithm specifically designed to handle class imbalance during the feature selection process. Using the Taiwan bankruptcy dataset as a case study, the algorithm incorporates the ExtraTreeClassifier() to manage class imbalance and identify a reduced set of relevant variables. To validate the selected features, SVM is applied within the imbalanced data context. Subsequently, analysis of variance (ANOVA) ranking is employed to further refine the variable set to three key features. An SVM model tailored for class imbalance is then constructed to assess the effectiveness of the final feature set. The proposed model significantly outperforms existing approaches in terms of classification performance. Specifically, it achieves a Type I error of 1.17% and a Type II error of 22.9%, compared to 4.4% and 39.4% reported in prior research. In terms of overall accuracy, our method reaches 83.1%, surpassing the 81.3% achieved by earlier studies. These results demonstrate that the proposed feature selection algorithm not only improves SVM accuracy but also outperforms other feature selection techniques when used in conjunction with SVMs, particularly under conditions of class imbalance.
Keywords: Analysis of variance; Bankruptcy prediction; Class imbalance; Feature selection; Support vector machines

This is an open access article under the CC BY-SA license.

Corresponding Author:
Borislava Toleva
Faculty of Economics and Business Administration, Sofia University St. Kl. Ohridski
Sofia, Bulgaria
Email: vrigazova@uni-sofia.bg

1. INTRODUCTION
Feature selection has become a key focus in machine learning, offering a means to enhance algorithm quality by removing redundant features. This process not only improves algorithm efficiency but also helps reveal a small group of factors that significantly impact an event. However, the effectiveness of feature selection algorithms depends on the data satisfying certain assumptions. For example, class imbalance in the target variable can lead to overfitting of the model. While much of the academic literature addresses class imbalance in the final classification stage, this research addresses it at an earlier stage: feature selection. We propose a new algorithm designed to handle class imbalance during feature selection, with a focus on improving the performance of support vector machines (SVM). Our algorithm aims to identify a subset of features that not only improves SVM predictions but also helps explore the relationship between the independent variables and the target variable.
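To make the pipeline outlined in the abstract more concrete, the following is a minimal sketch of a comparable workflow in scikit-learn. It is an illustration under stated assumptions, not the authors' implementation: the data are synthetic (a stand-in for the Taiwan bankruptcy dataset), the tree-based stage uses the ensemble class ExtraTreesClassifier with class_weight="balanced" (the text names ExtraTreeClassifier() without giving settings), the intermediate count of 10 features is hypothetical, and Type I and Type II errors are computed as false positive and false negative rates with the minority class treated as positive, which is only one of the conventions in use.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel, SelectKBest, f_classif
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, imbalanced stand-in data (roughly 5% positives); NOT the
# Taiwan bankruptcy dataset used in the paper.
X, y = make_classification(
    n_samples=2000, n_features=30, n_informative=6, n_redundant=4,
    weights=[0.95, 0.05], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Stage 1 (assumption): tree-based importances with balanced class weights,
# keeping the 10 highest-ranked features. threshold=-np.inf makes
# max_features the only selection criterion.
tree_selector = SelectFromModel(
    ExtraTreesClassifier(n_estimators=200, class_weight="balanced", random_state=0),
    max_features=10,
    threshold=-np.inf,
)

# Stage 2: ANOVA F-test ranking, reduced to three features as in the abstract.
anova_selector = SelectKBest(score_func=f_classif, k=3)

# Stage 3 (assumption): an SVM that reweights classes to counter imbalance.
pipeline = make_pipeline(
    StandardScaler(),
    tree_selector,
    anova_selector,
    SVC(kernel="rbf", class_weight="balanced"),
)
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)

# Map the two selection masks back to original column indices.
stage1_idx = np.flatnonzero(pipeline.named_steps["selectfrommodel"].get_support())
final_idx = stage1_idx[pipeline.named_steps["selectkbest"].get_support()]

# Error rates under the convention stated above (positive = minority class).
tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
type_i_error = fp / (fp + tn)    # false positive rate
type_ii_error = fn / (fn + tp)   # false negative rate
accuracy = (tp + tn) / (tp + tn + fp + fn)

print("Selected feature indices:", final_idx)
print(f"Type I error: {type_i_error:.3f}, Type II error: {type_ii_error:.3f}, "
      f"accuracy: {accuracy:.3f}")
```

Wrapping the selectors and the classifier in one pipeline means the feature ranking is learned from the training split only, so the test-set error rates are not influenced by information leaking from the held-out data. The numbers this sketch prints depend on the synthetic data and are not comparable to the results reported in the abstract.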