Variable Selection for Credit Risk Scoring on Loan Performance Using Regression Analysis

Dawn Iris Calibo 1, Melvin A. Ballera 2
Graduate Programs, Technological Institute of the Philippines-Manila, Manila, Philippines
e-mail: dawniris_19@yahoo.com 1, melvin.ballera@tip.edu.ph 2

Abstract-The advancement of information and communication technology has accelerated developments in the field of credit management. This has been accompanied by the introduction of data analytics to process information relevant to financial granting decisions. This paper presents research in progress on the design of a risk analysis and recommendation system for the Department of Science and Technology VII Small & Medium Enterprise Technology Upgrading Program (DOST VII-SETUP). Its main feature is credit risk analysis. To develop the application, the variables to be used for credit scoring are identified based on DOST Administrative Order No. 002 on Revised Small Enterprises Technology (SET-UP) Guidelines and nine years of historical data on granted loan projects from 2008 to 2016. Using Tableau software, a data mining process was carried out with linear regression and trend model visualization. Once the data on the selected variables were validated, a proposed decision matrix for credit scoring was developed. This leads to the recommendation to develop the credit risk analysis and recommendation system by computing the center of gravity of each score through a fuzzy logic algorithm.

Keywords-credit risk analysis; credit risk scoring; linear regression; loan performance; variable selection

I. INTRODUCTION

The global financial crisis had a particularly severe effect on developing countries like the Philippines. As the environment grew increasingly risky for businesses, small & medium enterprises (SMEs) needed to strengthen their financial positions to survive in the market.
This is where the government recognized the need for a source of funding that would continue to provide finance to firms during difficult times (Thorne, 2011) [1]. The Philippine government answered this challenge by devising ways to help SMEs through the Department of Science and Technology’s Small Enterprise Technology Upgrading Program (DOST-SETUP). The project is a nationwide strategy that encourages and assists micro, small, and medium enterprises (MSMEs) to adopt technological innovations that improve their products, services, and operations and increase their productivity and competitiveness; it also allocates funding assistance.

However, the real challenge the institution faces in financial granting is severe information problems, involving both moral vulnerability and poor selection (Soares et al., 2011) [2]. In fact, it is becoming progressively more difficult for financial institutions to assess borrowers’ risk and monitor their performance. This has driven the adoption of financial techniques in the credit appraisal process, with a view to carefully assessing the borrower’s business as well as financial position. Credit risk scoring plays a significant role in identifying and measuring risk, since a well-managed credit risk scoring system promotes safety and soundness by facilitating sound decision-making (Mohammad et al., 2015) [3].

However, before a credit risk scoring model can be identified, a variable selection process must first be carried out. This means that a collection of candidate model variables is tested for significance during model training. Thus, this paper aims to conduct variable selection for credit risk scoring on loan performance using linear regression, visualized with a trend line model, for the Department of Science and Technology Small & Medium Enterprise Technology Upgrading Program Region VII in Siquijor Province.
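The significance test described above can be sketched as a per-variable screening. This is only an illustration under stated assumptions, not the procedure run in Tableau for this study: each candidate factor the paper evaluates (DAR, LR, NPM, ROI, and year) is regressed on its own against a loan performance measure, here a hypothetical `repayment_rate`, and the slope's t-statistic is compared with a critical value. The function names, the `repayment_rate` measure, and all data values are placeholders introduced for the example; the numbers are synthetic and are not the DOST-SETUP records.

```python
import math


def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared, t_stat), where t_stat is the
    t-statistic of the slope with n - 2 degrees of freedom.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y need the same length and at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - sse / syy
    std_err = math.sqrt(sse / (n - 2) / sxx)
    # A perfect fit has zero standard error; report an infinite t-statistic.
    t_stat = slope / std_err if std_err > 0 else math.copysign(math.inf, slope)
    return slope, intercept, r_squared, t_stat


def screen_variables(candidates, response, t_crit):
    """Regress the response on each candidate separately and rank by |t|."""
    rows = []
    for name, values in candidates.items():
        slope, _, r_squared, t_stat = simple_ols(values, response)
        rows.append({
            "variable": name,
            "slope": slope,
            "r_squared": r_squared,
            "t_stat": t_stat,
            "keep": abs(t_stat) >= t_crit,
        })
    return sorted(rows, key=lambda row: abs(row["t_stat"]), reverse=True)


# Synthetic placeholder values for illustration only (assumption, not study data).
candidates = {
    "DAR": [0.62, 0.58, 0.55, 0.60, 0.48, 0.45, 0.50, 0.40, 0.38],
    "LR": [0.35, 0.40, 0.33, 0.38, 0.36, 0.31, 0.37, 0.34, 0.32],
    "NPM": [0.05, 0.07, 0.08, 0.06, 0.10, 0.12, 0.09, 0.13, 0.15],
    "ROI": [0.08, 0.10, 0.09, 0.12, 0.11, 0.15, 0.13, 0.16, 0.18],
    "year": [2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016],
}
# Hypothetical loan performance measure, one value per observation.
repayment_rate = [0.70, 0.74, 0.78, 0.72, 0.85, 0.88, 0.80, 0.92, 0.95]

# Two-sided 5% critical value of Student's t with 7 degrees of freedom (9 - 2).
T_CRIT = 2.365

results = screen_variables(candidates, repayment_rate, T_CRIT)
for row in results:
    print(f"{row['variable']:>4}  R^2={row['r_squared']:.3f}  "
          f"t={row['t_stat']:.2f}  keep={row['keep']}")
```

Screening one variable at a time mirrors fitting a separate trend line per factor; it does not account for correlation among the factors, which a multiple regression would.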
The selection evaluates the following contributing factors drawn from nine years of historical project data: debt-to-asset ratio (DAR), liability ratio (LR), net profit margin (NPM), return on investment (ROI), and year.

II. LITERATURE REVIEW

Credit risk analysis aims to help a funding agency determine whether potential borrowers are able to meet their credit obligations in accordance with written agreements. Where attainable, credit assessment procedures should take into account all knowledge and data relevant to creditworthiness (Thoubauer, 2010) [4]. However, an enterprise’s ability to repay debt is determined by its capacity to generate cash from operations, quality sales, or external financial markets in excess of its cash needs.

For financial granting institutions that have operated for years, manual methods of credit risk analysis have become outdated. This has resulted in the adoption of more recent developments. Specifically, data mining and analytics are among the tools known to be effective for finding information that is “hidden” in organizations' databases. In the process of credit granting, the use of instruments that support that process is advantageous and may become a key factor in credit risk analysis. This includes carrying out the knowledge discovery process, which comprises data selection, data pre-processing and cleanup, data transformation, data mining, and the interpretation and evaluation of results (Sousa et al., 2014) [5].

2019 IEEE 4th International Conference on Computer and Communication Systems 746 978-1-7281-1322-7/19/$31.00 ©2019 IEEE

The modern data analytics techniques, that have created a major