IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 10, No. 2, June 2021, pp. 407~413
ISSN: 2252-8938, DOI: 10.11591/ijai.v10.i2.pp407-413
Journal homepage: http://ijai.iaescore.com

Estimating probability of banking crises using random forest

Sri Hartini 1, Zuherman Rustam 2, Glori Stephani Saragih 3, María Jesús Segovia Vargas 4
1,2,3 Department of Mathematics, Universitas Indonesia, Depok, 16424, Indonesia
4 Department of Financial Economy and Accounting I, Universidad Complutense de Madrid, Madrid, 28223, Spain

Article Info

Article history:
Received Mar 9, 2020
Revised Mar 2, 2021
Accepted Apr 16, 2021

ABSTRACT

Banks have a crucial role in the financial system. When many banks suffer from a crisis, financial instability can follow. According to their impact, banking crises can be divided into two categories: systemic and non-systemic. A systemic crisis may cause even stable banks to go bankrupt. Hence, this paper proposes random forest for estimating the probability of banking crises as a preventive measure. Random forest is well known as a robust technique for both classification and regression that is relatively insensitive to outliers and resistant to overfitting. The experiments were conducted using the financial crisis database, containing a sample of 79 countries in the period 1981-1999 (annual data). This dataset has 521 samples consisting of 164 crisis samples and 357 non-crisis samples. The experiments showed that using 90 percent of the data for training delivered 0.98 accuracy, 0.92 sensitivity, 1.00 precision, and 0.96 F1-score, the highest scores among the training percentages tested. These results are also better than those of state-of-the-art methods applied to the same dataset. Therefore, the proposed method shows promising results for predicting the probability of banking crises.
Keywords:
Banking crises
Machine learning
Prediction of banking crises
Probability of banking crises
Random forest
Random forest regression

This is an open access article under the CC BY-SA license.

Corresponding Author:
Sri Hartini
Department of Mathematics
Universitas Indonesia
Depok, 16424, Indonesia
Email: sri.hartini@sci.ui.ac.id

1. INTRODUCTION
Banking crises are costly to the economy, not only because of their high direct costs but also because of their adverse effects on the wider economy [1]. Crises in banks can significantly lessen global economic growth by slowing down economic activity, limiting the number of stable currencies (exchange rate) for emerging economies, and weighing on their capacity to pay their debts [2]. Banking stability has continued to receive heightened attention since the global financial crisis, circa 2008. Much of this attention has been focused on the prevention of future systemic crises. Notwithstanding the importance of these preventative efforts, it is essential to understand the effectiveness of banking sector stability as a buffer to the real economy when crises do occur [3]. Statistical methods are a standard approach to estimating the probability of a banking crisis, but some statistical techniques have yielded low accuracy. A further disadvantage is that each statistical method rests on its own assumptions; if the data do not satisfy those assumptions, the method cannot be used. With the development of computational methods, machine learning has become one of the approaches that researchers commonly use to help prevent banking crises. Several studies have shown that machine learning can be a tool for predicting banking crises. Beutel et al. [4] suggested that further enhancements to machine learning early warning models are needed before they are able to offer a substantial