Forecasting Tax Risk by Machine Learning: Case of Firms in Ho Chi Minh City

Nguyen Anh Phong a,b,1, Phan Huy Tam a,b and Le Quoc Cuong a,b
a University of Economics and Law, Vietnam
b Vietnam National University, Ho Chi Minh City, Vietnam

Abstract. Tax is the main source of income for the State. However, managing tax collection effectively and limiting tax risks is a challenge for state tax authorities. This study applies machine learning, using the logistic regression algorithm, to assess and predict which firms carry tax risk. The data set includes 872 observations of firms in the Vietnamese market. The machine learning approach classifies firms into two categories, with or without tax risk, based on six main factors: (i) revenue and other income; (ii) expenses; (iii) liquidity; (iv) assets; (v) liabilities; and (vi) equity. The results show that the machine learning method is effective and accurate in identifying and predicting risks in tax declaration. The authors recommend that tax agencies apply machine learning methods and go further with big data and artificial intelligence approaches to identify and classify enterprises.

Keywords. tax risk, machine learning, enterprises

1. Introduction

Tax is the main source of state revenue and a major contributor to the national budget that funds state expenditure. It is also an important tool for redistributing gross social product and national income [1]. Thus, tax can be seen as an economic measure of every state.
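The classification setup summarized in the abstract, a logistic regression that separates firms with and without tax risk using six groups of financial indicators, can be illustrated with a minimal sketch. This is not the authors' implementation. The feature names in FACTORS, the synthetic data, the labeling rule in true_w, and the training settings are all assumptions made for illustration. Only the sample size of 872 and the six factor groups come from the text.

```python
import math
import random

# Hypothetical names for the six factor groups listed in the abstract.
FACTORS = ["revenue_income", "expenses", "liquidity", "assets", "liabilities", "equity"]


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, x):
    # Estimated probability that a firm belongs to the "tax risk" class.
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def fit_logistic(X, y, lr=0.5, epochs=300):
    # Batch gradient descent on the mean log-loss.
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(weights, bias, xi) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def make_synthetic_firms(n=872, seed=0):
    # Synthetic stand-in data (assumption): standardized indicators and an
    # arbitrary linear labeling rule, NOT the firms studied in the paper.
    rng = random.Random(seed)
    true_w = [-1.0, 1.0, -0.5, -0.5, 1.0, -1.0]  # illustrative only
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in FACTORS]
        score = sum(w * v for w, v in zip(true_w, x))
        X.append(x)
        y.append(1 if score > 0 else 0)
    return X, y


X, y = make_synthetic_firms()
weights, bias = fit_logistic(X, y)
predicted = [1 if predict_proba(weights, bias, x) >= 0.5 else 0 for x in X]
accuracy = sum(p == t for p, t in zip(predicted, y)) / len(y)
print(f"training accuracy on synthetic data: {accuracy:.3f}")
```

The 0.5 probability threshold is a common default rather than a choice reported in the source. In a risk-screening setting it would likely be tuned to balance missed risky firms against unnecessary audits, and accuracy would be measured on held-out data rather than on the training set used here.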
1 Corresponding Author, Nguyen Anh Phong, Faculty of Finance and Banking, University of Economics and Law, Vietnam National University, Ho Chi Minh City, Vietnam. Email: phongna@uel.edu.vn
JEL Classification Code: C01, C81, G38, H11
This research is funded by Vietnam National University Ho Chi Minh City (VNU-HCM) under grant number DS2022-34-03.
Fuzzy Systems and Data Mining VIII, A.J. Tallón-Ballesteros (Ed.). © 2022 The authors and IOS Press. This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0). doi:10.3233/FAIA220371

Because of this importance, the Government approved the tax system reform strategy for the period 2011-2020. Along with the completion and development of new laws on tax policy, the Law on Tax Administration was promulgated, then amended and supplemented to meet the requirements of socio-economic development and international economic integration. The law establishes a common legal framework that is applied uniformly in implementing all tax policies, overcomes the separation in management methods among taxes, and creates a foundation for the self-declaration and self-payment mechanism. However, due to the rapid increase in workload and in the number of taxpayers, the self-declaration management mechanism is facing many risks such as the risk of