Forecasting Tax Risk by Machine Learning: Case of Firms in Ho Chi Minh City
Nguyen Anh Phong a,b,1, Phan Huy Tam a,b and Le Quoc Cuong a,b
a University of Economics and Law, Vietnam
b Vietnam National University, Ho Chi Minh City, Vietnam
Abstract. Tax is the main source of income for the State. However, managing tax
collection effectively and limiting tax risks is a challenge for state tax authorities.
This study applies machine learning, using a logistic regression algorithm, to assess
and predict firms with tax risks. The data set includes 872 observations of firms in
the Vietnamese market. The machine learning approach is used to classify firms into
two categories, with or without tax risk, based on six main factors: (i) revenue and
other income; (ii) expenses; (iii) liquidity; (iv) assets; (v) liabilities; and (vi) equity.
The results show that the machine learning method is effective and accurate in
identifying and predicting risks in tax declaration. The authors recommend that
tax agencies apply machine learning methods and go further with big data and
artificial intelligence approaches to identify and classify enterprises.
Keywords. tax risk, machine learning, enterprises
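The classification setup summarized in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the data are synthetic, the feature names are hypothetical labels for the six factor groups, the weights used to generate labels are made up, and the model is fitted by plain gradient descent using only the Python standard library.

```python
import math
import random

# Hypothetical names for the six factor groups listed in the abstract.
FEATURES = ["revenue_income", "expenses", "liquidity",
            "assets", "liabilities", "equity"]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_synthetic_firms(n, seed=0):
    """Generate standardized synthetic indicators and a made-up risk label."""
    rng = random.Random(seed)
    # Assumed weights, chosen only to produce labels for this demo;
    # they carry no economic meaning and are not taken from the paper.
    true_w = [-1.0, 1.0, -0.5, -0.5, 1.5, -1.0]
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in FEATURES]
        score = sum(w * v for w, v in zip(true_w, row))
        X.append(row)
        y.append(1 if score > 0 else 0)
    return X, y


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights and bias by full-batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) - label
            for j in range(d):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, row, threshold=0.5):
    """Return 1 (tax risk) if the estimated probability reaches the threshold."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b)
    return 1 if p >= threshold else 0


# Train on one synthetic sample and evaluate on a separate one.
X_train, y_train = make_synthetic_firms(400, seed=0)
X_test, y_test = make_synthetic_firms(200, seed=1)
w, b = fit_logistic(X_train, y_train)
preds = [predict(w, b, row) for row in X_test]
accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

With real data, each row would hold a firm's indicators for the six factor groups and the label would come from the tax authority's risk assessment. The 0.5 threshold is also an assumption and could be tuned to trade off missed risky firms against false alarms.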
1. Introduction
Tax is the main source of state revenue and a major contributor to the national budget
that funds State expenditure; it is also an important tool for redistributing gross social
product and national income [1]. Tax can thus be seen as an economic instrument of
every state. Because of this importance, the Government approved the tax system reform
strategy for the period 2011-2020. Alongside the completion and development of new
laws on tax policy, the Law on Tax Administration was promulgated, amended, and
supplemented to meet the requirements of socio-economic development and
international economic integration. It establishes a common legal framework applied
uniformly in implementing all tax policies, overcomes the separation in management
methods among taxes, and creates a foundation for the self-declaration and
self-payment mechanism.
However, due to the rapid increase in workload and the number of taxpayers, the
self-declaration management mechanism is facing many risks such as the risk of
1 Corresponding Author: Nguyen Anh Phong, Faculty of Finance and Banking, University of Economics
and Law, Vietnam National University, Ho Chi Minh City, Vietnam. Email: phongna@uel.edu.vn
JEL Classification Code: C01, C81, G38, H11
This research is funded by Vietnam National University Ho Chi Minh City (VNU-HCM) under grant number
DS2022-34-03
Fuzzy Systems and Data Mining VIII
A.J. Tallón-Ballesteros (Ed.)
© 2022 The authors and IOS Press.
This article is published online with Open Access by IOS Press and distributed under the terms
of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
doi:10.3233/FAIA220371