Evolutionary Ensemble Approach for Behavioral Credit Scoring

Nikolay O. Nikitin (✉), Anna V. Kalyuzhnaya, Klavdiya Bochenina, Alexander A. Kudryashov, Amir Uteuov, Ivan Derevitskii, and Alexander V. Boukhanovsky

ITMO University, 49 Kronverksky Pr., St. Petersburg 197101, Russian Federation
nikolay.o.nikitin@gmail.com

Abstract. This paper examines the quality of scoring models that can be achieved using not only application form data but also behavioral data extracted from transactional datasets. Several model types and different ensemble configurations were analyzed in a set of experiments. A further aim of the research is to demonstrate the effectiveness of evolutionary optimization of the ensemble structure and to use it to increase the quality of default prediction. The results are illustrated with models for borrower default prediction trained on a set of features (purchase amount, location, merchant category) extracted from a transactional dataset of bank customers.

Keywords: Credit scoring · Credit risk modeling · Financial behavior · Ensemble modeling · Evolutionary algorithms

1 Introduction

Scoring tasks and the associated scoring models vary widely depending on the application area and objectives. For example, application form-based scoring [1] is used by lenders to decide which credit applicants are good or bad. Collection scoring techniques [2] are used to segment defaulted borrowers in order to optimize debt recovery, and the profit scoring approach [3] is used to estimate the profit on a specific credit product.

In this work, we consider the scoring prediction problem for behavioral data in several aspects. First, a set of experiments was conducted to determine the potential quality of default prediction using different types of scoring models for the behavioral dataset.
Then, the possible impact of the evolutionary approach on the quality of an ensemble of different models, achieved by optimizing its structure, was analyzed in comparison with the un-optimized ensemble.

The paper is organized as follows. Section 2 reviews works in the same domain. Section 3 introduces the problem statement and the approaches to the scoring task. Section 4 describes the dataset used as a case study and presents the conducted experiments. Section 5 summarizes the results; the conclusion and directions for further improving the scoring model are given in Sect. 6.

© Springer International Publishing AG, part of Springer Nature 2018
Y. Shi et al. (Eds.): ICCS 2018, LNCS 10862, pp. 825–831, 2018.
https://doi.org/10.1007/978-3-319-93713-7_81