mathematics
Article
A Robust Approach for Identifying the Major Components of
the Bribery Tolerance Index
Daniel Homocianu 1, Aurelian-Petrus Plopeanu 2 and Rodica Ianole-Calin 3,*
Citation: Homocianu, D.; Plopeanu,
A.-P.; Ianole-Calin, R. A Robust
Approach for Identifying the Major
Components of the Bribery Tolerance
Index. Mathematics 2021, 9, 1570.
https://doi.org/10.3390/math9131570
Academic Editor: David Carfì
Received: 14 June 2021
Accepted: 1 July 2021
Published: 3 July 2021
Publisher’s Note: MDPI stays neutral
with regard to jurisdictional claims in
published maps and institutional affiliations.
Copyright: © 2021 by the authors.
Licensee MDPI, Basel, Switzerland.
This article is an open access article
distributed under the terms and
conditions of the Creative Commons
Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1 Department of Accounting, Business Information Systems and Statistics, Faculty of Economics and Business Administration, Alexandru Ioan Cuza University of Iasi, 700505 Iași, Romania; daniel.homocianu@uaic.ro
2 Humanities and Social Sciences Research Department, Institute of Interdisciplinary Research, Alexandru Ioan Cuza University of Iasi, 700107 Iași, Romania; aplopeanu@gmail.com
3 Faculty of Administration and Business, University of Bucharest, 030018 Bucharest, Romania
* Correspondence: rodica.ianole@faa.unibuc.ro
Abstract: The paper aims to emphasize the advantages of several advanced statistical and data mining
techniques when applied to the dense literature on corruption measurements and determinants. For
this purpose, we used all seven waves of the World Values Survey and employed the Naive
Bayes technique in SQL Server Analysis Services 2016, together with the LASSO package and logit and
melogit regressions with raw coefficients in Stata 16. We further conducted different types of tests and
cross-validations on the wave, country, gender, and age categories. To eliminate multicollinearity,
we used predictor correlation matrices. Moreover, we assessed the maximum computed variance
inflation factor (VIF) against a maximum acceptable threshold that depends on the model’s R squared
in Ordinary Least Squares (OLS) regressions. Our main contribution consists of a methodology for
exploring and validating the most important predictors of the risk associated with bribery tolerance.
We found a significant role for three influences, corresponding to questions about attitudes towards
property, authority and public services, and other people, in terms of anti-cheating, anti-evasion,
and anti-violence. We used scobit, probit, and logit regressions with average marginal effects to
build and test the index based on these attitudes. We also successfully tested the index using risk
prediction nomograms and accuracy measurements (AUCROC > 0.9).
Keywords: bribery tolerance index; Naive Bayes; LASSO; maximum acceptable VIF; correlation
matrices; cross-validations; minimum accuracy loss; mixed-effects; average marginal effects; risk
prediction nomograms
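As a hedged illustration of the multicollinearity check summarized in the abstract, the following minimal sketch computes per-predictor VIFs and compares the largest one against a threshold derived from the model's R squared. The threshold rule 1/(1 - R²) is an assumption made here for illustration, since the abstract does not state the exact formula. The data are synthetic, the function names (`r_squared`, `vif`, `max_acceptable_vif`) are hypothetical, and Python with NumPy stands in for the Stata workflow the authors actually used.

```python
import numpy as np


def r_squared(y, X):
    """R^2 of an OLS fit of y on X (an intercept column is added here)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def vif(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the others."""
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        out.append(1.0 / (1.0 - r_squared(X[:, j], others)))
    return np.array(out)


def max_acceptable_vif(model_r2):
    """ASSUMED threshold rule (not confirmed by the abstract): 1 / (1 - model R^2)."""
    return 1.0 / (1.0 - model_r2)


# Synthetic, hypothetical data: x2 is deliberately almost a copy of x1.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = x1 + x3 + rng.normal(size=n)

vifs = vif(X)
threshold = max_acceptable_vif(r_squared(y, X))
flagged = vifs.max() > threshold  # True signals problematic collinearity
```

In this toy setup the near-duplicate predictors `x1` and `x2` receive large VIFs while the independent `x3` stays close to 1, so the maximum VIF exceeds the assumed threshold and one of the two collinear predictors would be a candidate for removal.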
1. Introduction
The current massive increase in data about people’s attitudes and behaviors raises
both opportunities and challenges for economics and social sciences, on different levels [1].
One major area of innovation is reflected in the advanced statistical methodologies used to
capture as accurately as possible the most relevant and actionable insights for private and
public use [2]. In this spirit, there is a growing tendency to define comprehensive measures
that integrate various aspects of individual behaviors or socio-economic
phenomena (e.g., development [3], poverty [4], and sustainability [5]). Under this umbrella,
the use of composite indices is common practice, with a high degree of
heterogeneity in the computational techniques employed to obtain
them. These range from additive approaches (e.g., the tax morale index [6]) and
ad-hoc selection of variables to more complex procedures, such as principal component analysis
and selection techniques using different correlation coefficients (e.g., the sustainable
development index for European economies [7]).
As reported by [8], although many methods for variable selection are available
(e.g., ridge or partial least-squares regressions [9]), the least absolute shrinkage and selection
operator (LASSO) regression is desirable because it ensures sparsity of coefficients and