Evolutionary Ensemble Approach
for Behavioral Credit Scoring
Nikolay O. Nikitin (✉), Anna V. Kalyuzhnaya, Klavdiya Bochenina,
Alexander A. Kudryashov, Amir Uteuov, Ivan Derevitskii,
and Alexander V. Boukhanovsky
ITMO University, 49 Kronverksky Pr., St. Petersburg 197101, Russian Federation
nikolay.o.nikitin@gmail.com
Abstract. This paper examines the quality that scoring models can achieve
when they use not only application form data but also behavioral data extracted
from transactional datasets. Several model types and different ensemble
configurations were analyzed in a set of experiments. Another aim of the research
is to demonstrate the effectiveness of evolutionary optimization of the ensemble
structure and to use it to increase the quality of default prediction. Example
results are presented for borrower default prediction models trained on a set of
features (purchase amount, location, merchant category) extracted from a
transactional dataset of bank customers.
Keywords: Credit scoring · Credit risk modeling · Financial behavior ·
Ensemble modeling · Evolutionary algorithms
1 Introduction
Scoring tasks and the associated scoring models vary widely depending on the
application area and objectives. For example, application form-based scoring [1] is
used by lenders to classify credit applicants as good or bad. Collection scoring
techniques [2] are used to segment defaulted borrowers in order to optimize debt
recovery, and the profit scoring approach [3] is used to estimate the profit on a
specific credit product.
In this work, we consider the scoring prediction problem for behavioral data in
several aspects. First, a set of experiments was conducted to determine the potential
quality of default prediction using different types of scoring models for the behavioral
dataset. Then, the possible impact of the evolutionary approach on the quality of an
ensemble of different models, obtained by optimizing its structure, was analyzed in
comparison with the un-optimized ensemble.
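To make the idea of evolutionary optimization of an ensemble structure concrete, the following is a minimal sketch, not the method used in this paper. It assumes that the "structure" is simply a binary mask that selects which base models take part in an averaging ensemble, and that fitness is plain accuracy at a 0.5 threshold. The names `ensemble_score`, `evolve_mask`, and `make_synthetic`, as well as the synthetic data, are hypothetical and serve only as an illustration.

```python
import random


def ensemble_score(mask, model_probs, labels):
    """Accuracy of the averaged default probabilities of the selected models."""
    chosen = [probs for keep, probs in zip(mask, model_probs) if keep]
    if not chosen:
        return 0.0  # an empty ensemble cannot predict anything
    correct = 0
    for i, y in enumerate(labels):
        avg = sum(probs[i] for probs in chosen) / len(chosen)
        correct += int((avg >= 0.5) == bool(y))
    return correct / len(labels)


def evolve_mask(model_probs, labels, pop_size=20, generations=30,
                mutation_rate=0.1, seed=0):
    """Toy genetic algorithm over binary masks (assumed encoding of the structure)."""
    rng = random.Random(seed)
    n = len(model_probs)

    def fitness(mask):
        return ensemble_score(mask, model_probs, labels)

    # Seed the population with the full (un-optimized) ensemble so that,
    # thanks to elitism, the result is never worse than it on this data.
    population = [tuple([1] * n)]
    population += [tuple(rng.randint(0, 1) for _ in range(n))
                   for _ in range(pop_size - 1)]

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]  # elitist truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation: each gene flips with probability mutation_rate.
            child = tuple(bit ^ int(rng.random() < mutation_rate) for bit in child)
            children.append(child)
        population = parents + children

    return max(population, key=fitness)


def make_synthetic(n_samples=200, noise_levels=(0.1, 0.2, 0.6, 0.9), seed=1):
    """Synthetic labels and noisy per-model probabilities (illustration only)."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n_samples)]
    model_probs = [
        [min(1.0, max(0.0, y + rng.gauss(0.0, noise))) for y in labels]
        for noise in noise_levels
    ]
    return labels, model_probs


if __name__ == "__main__":
    labels, model_probs = make_synthetic()
    best = evolve_mask(model_probs, labels)
    full = tuple([1] * len(model_probs))
    print("selected mask:", best)
    print("optimized:", ensemble_score(best, model_probs, labels))
    print("un-optimized:", ensemble_score(full, model_probs, labels))
```

In this sketch the fitness is computed on the same data that drives the selection; a realistic comparison with the un-optimized ensemble would evaluate both on a held-out sample and would more likely use a ranking metric than plain accuracy.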
The paper continues in Sect. 2 with a review of works in the same domain. In Sect. 3
we introduce the problem statement and the approaches to the scoring task. Section 4
describes the dataset used as a case study and presents the conducted experiments. In
Sect. 5 we provide a summary of the results; the conclusion and future ways of
improving the scoring model are given in Sect. 6.
© Springer International Publishing AG, part of Springer Nature 2018
Y. Shi et al. (Eds.): ICCS 2018, LNCS 10862, pp. 825–831, 2018.
https://doi.org/10.1007/978-3-319-93713-7_81