European Journal of Operational Research 249 (2016) 517–524
Spatial dependence in credit risk and its improvement in credit scoring
Guilherme Barreto Fernandes a,b,∗, Rinaldo Artes a
a Insper Institute of Education and Research, Rua Quatá, 300, Vila Olímpia, São Paulo, Brazil
b Serasa Experian, Alameda dos Quinimuras, 187, Planalto Paulista, CEP 04068-900 São Paulo, Brazil
article info
Article history:
Received 15 March 2014
Accepted 6 July 2015
Available online 29 July 2015
Keywords:
Risk analysis
Spatial dependence
SME credit risk
Ordinary kriging
Credit scoring
abstract
Credit scoring models are important tools in the credit granting process. These models measure the credit
risk of a prospective client based on idiosyncratic variables and macroeconomic factors. However, small and
medium-sized enterprises (SMEs) are also subject to the effects of the local economy. Using a data set with the
location and default information of 9 million Brazilian SMEs, provided by Serasa Experian (the largest
Brazilian credit bureau), we propose a measure of the local risk of default based on the application of ordinary
kriging. This variable was included in logistic credit scoring models as an explanatory variable. These
models performed better than models without this variable: a gain of around 7
percentage points in KS and Gini was observed.
© 2015 Elsevier B.V. and Association of European Operational Research Societies (EURO) within the
International Federation of Operational Research Societies (IFORS). All rights reserved.
1. Introduction
The correct evaluation of credit risk is an important issue of the
Basel agreements. In this context, the probability of default (PD) has
a central role. Statistical and mathematical models have been widely
employed in order to estimate the PD for companies or contracts.
These models, called credit scoring models, usually determine the risk
of default conditional on exogenous factors. The Basel agreements
require conservative estimates of PD for loan portfolios, and retail
customers – such as small and medium sized enterprises (SMEs) –
must be addressed under the perspective of a massive risk evaluation
by means of statistical models. In the present paper logistic models
(Hosmer & Lemeshow, 2000) will be used to predict the PD of SMEs.
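For reference, a logistic credit scoring model can be written in its standard form as sketched below. The notation is assumed here for illustration and is not taken from the paper: $Y_i$ is the default indicator of firm $i$, $\mathbf{x}_i$ is its vector of explanatory variables, and $\beta_0$ and $\boldsymbol{\beta}$ are the coefficients to be estimated.

```latex
\[
\mathrm{PD}_i = P(Y_i = 1 \mid \mathbf{x}_i)
  = \frac{\exp\left(\beta_0 + \mathbf{x}_i^{\top}\boldsymbol{\beta}\right)}
         {1 + \exp\left(\beta_0 + \mathbf{x}_i^{\top}\boldsymbol{\beta}\right)},
\qquad
\log\frac{\mathrm{PD}_i}{1 - \mathrm{PD}_i}
  = \beta_0 + \mathbf{x}_i^{\top}\boldsymbol{\beta}.
\]
```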
Information on payment history and financial capacity is naturally
understood as a relevant risk factor in these models. It also seems
reasonable to assume that the firm's location adds information to
credit scoring models, particularly to those aimed at predicting the default
risk of SMEs. Oftentimes the main customers of these firms are the
population and other companies located in the region where they
operate. Thus, when considering an SME located in a region that is
facing an economic downturn, affecting the performance of nearby
businesses, the risk of default of this firm is expected to increase.
In principle, the inclusion of a spatial factor in credit
scoring models could be replaced by characteristics of the local economy.
However, gathering this information is very difficult when the area
of investigation is large, since information on small localities in those
regions can be rather scarce or unavailable. Similar problems were
reported by Gerkman (2011) in a study of real estate prices.
∗ Corresponding author at: Insper and Serasa Experian Alameda dos Quinimuras,
187 Analytics São Paulo, SP - Brazil. Tel.: +55 11 98642 8017; fax: +55 11 3805 4168.
E-mail address: gbfernandes2002@gmail.com (G.B. Fernandes).
In this context, the analysis of spatial dependence is justified in a
comprehensive study of the credit risk of SMEs; few studies on credit
scoring, however, consider this effect. The aim of this paper is to
incorporate information on the spatial behavior of default into credit scoring
models for SMEs.
The use of a ZIP-code-related explanatory variable is a classical
way to introduce spatial information into credit scoring models.
However, it is a qualitative variable with a potentially large number
of categories, which produces a non-parsimonious model and brings
the risk of multicollinearity. Moreover, regions with few
individuals would not have a good risk assessment. The large number
of ZIP-code categories can also lead to overfitting and may
make the model unstable over time. Finally, economic phenomena do
not necessarily respect this territorial division.
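The drawbacks of a ZIP-code categorical variable can be illustrated with a minimal sketch. The toy portfolio, the function name `zip_dummy_summary`, and the threshold `min_firms` are hypothetical choices made for illustration; none of them comes from the paper or its data.

```python
from collections import Counter

# Hypothetical toy portfolio of (zip_code, defaulted) pairs; not data from the paper.
firms = [
    ("01310", 0), ("01310", 1), ("01310", 0), ("01310", 0),
    ("04068", 0), ("04068", 0), ("04068", 1),
    ("04546", 1),
    ("69900", 0),
    ("78000", 0), ("78000", 0),
]


def zip_dummy_summary(records, min_firms=3):
    """Summarise what a ZIP-code dummy encoding would look like.

    Returns the number of dummy columns, the sparsely populated ZIP codes,
    and the raw default rate observed in each ZIP code.
    """
    counts = Counter(zip_code for zip_code, _ in records)
    defaults = Counter(zip_code for zip_code, flag in records if flag == 1)
    # One category serves as the reference level, so k categories need k - 1 dummies.
    n_dummies = len(counts) - 1
    # ZIP codes with fewer than min_firms firms give unreliable risk estimates.
    sparse = sorted(z for z, n in counts.items() if n < min_firms)
    rates = {z: defaults[z] / n for z, n in counts.items()}
    return n_dummies, sparse, rates


n_dummies, sparse_zips, default_rates = zip_dummy_summary(firms)
print("dummy columns:", n_dummies)
print("sparse ZIP codes:", sparse_zips)
print("raw default rates:", default_rates)
```

In this toy case, the ZIP codes with a single firm have raw default rates of exactly 0 or 1, which is the kind of poor risk assessment described above for regions with few individuals.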
In this paper, spatial dependence is taken into account by including
a quantitative variable in the model, which may be considered a
measure of the spatial risk of default, obtained by ordinary
kriging (Matheron, 1963). This risk factor is used as an explanatory
variable in logistic credit scoring models. Two alternatives
for including this factor in the logistic model have been considered.
The first, and simplest, is to treat it as a fixed variable
(without measurement error). The other is to admit that the observed
value, $\hat{Z}$, is in fact a proxy for an unobservable variable that expresses
the spatial risk factor ($\tau$), such that $\hat{Z} = \tau + \varepsilon$, where $\varepsilon$ is a random
measurement error (logistic model with errors in variables) (Clark,
1982).
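To make the idea of an ordinary kriging predictor concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation: the exponential semivariogram, its parameters (`sill`, `corr_range`), the function names, and the four observed default rates are all assumptions chosen for illustration. The sketch solves the usual ordinary kriging system, in which the weights are constrained to sum to one, and returns the weighted average of the observed values.

```python
import numpy as np


def exponential_variogram(h, sill=1.0, corr_range=1.0):
    """Exponential semivariogram without nugget (assumed model); gamma(0) = 0."""
    return sill * (1.0 - np.exp(-np.asarray(h, dtype=float) / corr_range))


def ordinary_kriging(coords, values, target, variogram=exponential_variogram):
    """Predict the value at `target` from observations at `coords`.

    Solves the ordinary kriging system
        [Gamma  1] [w ]   [gamma_0]
        [1^T    0] [mu] = [   1   ]
    and returns the prediction w @ values together with the weights w.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)

    # Pairwise distances between observed locations.
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)

    # Left-hand side: semivariances bordered by the unbiasedness constraint.
    lhs = np.zeros((n + 1, n + 1))
    lhs[:n, :n] = variogram(dists)
    lhs[:n, n] = 1.0
    lhs[n, :n] = 1.0

    # Right-hand side: semivariances between observations and the target.
    rhs = np.ones(n + 1)
    rhs[:n] = variogram(np.linalg.norm(coords - target, axis=1))

    solution = np.linalg.solve(lhs, rhs)
    weights = solution[:n]  # the last entry is the Lagrange multiplier
    return float(weights @ values), weights


# Hypothetical observed default rates at the corners of a unit square.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
rates = [0.02, 0.05, 0.03, 0.10]

prediction, weights = ordinary_kriging(coords, rates, (0.5, 0.5))
print("kriged local risk at the centre:", prediction)
print("kriging weights:", weights)
```

A value obtained in this way for the location of each firm would play the role of $\hat{Z}$ in the logistic model, either taken as fixed or treated as an error-prone proxy for $\tau$.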
http://dx.doi.org/10.1016/j.ejor.2015.07.013
ISSN 0377-2217