2024 4th International Conference on Mobile Networks and Wireless Communications (ICMNWC) | 979-8-3315-2834-8/24/$31.00 ©2024 IEEE | DOI: 10.1109/ICMNWC63764.2024.10872045

Deep Learning to predict the Success of Startup Companies based on Smote and XB-RFE Model

1st Manikanth Sarisa, Principal Software Engineer, Ally Financial INC, manikanthsarisa@outlook.com
2nd Gagan Kumar Patra, Senior Solution Architect, Tata Consultancy Services, gagankumarpatra12@outlook.com
3rd Chandrababu Kuraku, Senior Solution Architect, Mitaja Corporation, ChandrababuKuraku@outlook.com
4th Siddharth Konkimalla, Sr Network Development Engineer, Adobe INC, SiddharthKonkimalla@outlook.com
5th Shravan Kumar Rajaram, Sr. Technical Support Engineer- Networking, Microsoft, ShravanKumarRajaram@outlook.com
6th Mohit Surender Reddy, Sr. Technical Support Engineer- Networking, Microsoft, mohitsurenderreddy@outlook.com

Abstract—Startups are a major driver of economic development, and the success of these high-risk ventures can provide significant returns to venture capital firms. Investors can gain a significant advantage over their competitors if they can accurately forecast the success of startups. The overarching goal of this study is to identify the critical success factors for new businesses and to develop a model to classify startups. The proposed method consists of preprocessing, feature extraction, and model training. Preprocessing comprises data cleansing, missing-value handling, and N-gram analysis. Feature extraction uses the LDA algorithm for subject recognition. For model training, we use SMOTE-XGBoost-RFE. The proposed model achieves an average accuracy of 90.35 percent, which is better than alternatives such as RFE and XGBoost.

Keywords—extreme gradient boosting (XGBoost), startup companies, recursive feature elimination (RFE)

I. INTRODUCTION
Startup businesses are crucial to the modern economy.
Compared with larger, more established enterprises, startups and smaller businesses generate jobs at a higher rate. New businesses drive innovation and technological growth because they bring fresh ideas and healthy competition to a sector. For these reasons, new business ventures are a worthwhile subject of research. The startup environment can be extremely risky and cutthroat. During their first three years in business, fewer than 60% of startups fail. A key component of success is acquiring enough funding to sustain and grow a firm. The challenging but ultimately rewarding objective of identifying profitable enterprises at an early stage was our starting point for this work. This way, we can help these companies identify their weak spots and give the investors who support them an advantage [1]. Despite the obvious practical importance of the problem, neither the academic nor the financial community has raised it.

An organization's founders and early employees are its most important stakeholders. With the help of prediction models, they can preserve valuable resources (human, monetary, etc.) and make informed decisions about where to focus their efforts, which improves the chances of success for their business ideas. Investors also play an important role, and prediction models may improve their track record. Lastly, everyone with a stake in the success or failure of a startup should be informed about the results. This includes suppliers, who will need to set up or manage new supply-chain procedures, as well as clients and customers, who may rely on the new product or service [2]. The complex and risky environment in which startups originate and expand requires a prediction model to account for numerous internal and external variables.
The problem is even harder for new businesses, which lack supporting financial or operational records. The available data is largely qualitative, scattered, and drawn from a range of sources.

Startups are an important part of the modern world's economic infrastructure because they encourage new ideas, healthy competition, and the creation of new employment opportunities. Startups are usually born out of the desire of enterprising individuals to address a market need by creating a product or service, sometimes using technology. Because of their limited resources, startups often require substantial funding to carry out R&D, marketing, hiring, and client acquisition, so funding is crucial to them. Venture capital, angel investment, and government grants are all potential sources of funding and equity that can alter a startup's course of development. The role of venture capital (VC) investors is to fill the funding gap that occurs during commercialization [3]. Venture capital is a form of private equity finance commonly extended to startups and growing companies that show signs of rapid growth, such as increases in employees, annual revenue, or operational size. A startup's ability to raise capital often depends heavily on the funding rounds it goes through. A typical fundraising process consists of two main stages: early rounds and late rounds. Seed money and other forms of early-stage funding help a firm get from the idea stage to the point where it starts to make money.

Based on several essential characteristics, such as the geographic location of enterprises, funding rounds, and factors leading to the success or failure of a company, the proposed work examines the viability of using DL approaches to forecast the outcome of startups.
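
The abstract names SMOTE as the class-balancing step of the SMOTE-XGBoost-RFE model, but this part of the paper does not give its settings. As a rough illustration only, the sketch below shows the core SMOTE idea: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. The function names `smote` and `balance`, the neighbour count `k=2`, the label coding (1 as the minority class), and the toy data are all assumptions made for this example and are not taken from the paper; a real pipeline would more likely rely on a library implementation.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Return n_new synthetic points interpolated between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label=1, k=2, seed=0):
    """Oversample the minority class until both classes have the same size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new_points = smote(minority, n_majority - len(minority), k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Toy data (hypothetical): six samples of class 0, three of class 1.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.0], [0.2, 0.3],
     [1.0, 1.0], [1.2, 0.9], [0.9, 1.1]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

X_bal, y_bal = balance(X, y)
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region already covered by the minority class. Oversampling of this kind is normally applied to the training split only, so that the test data contains no synthetic samples.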
Using Crunchbase data obtained through the Kaggle website, we also compare the performance of several machine learning models. Comparing the outcomes of the various models makes it possible to improve our recommendations to investors. The literature survey is followed by a description of the research method. The analysis and results are then presented. The paper closes with conclusions and some recommendations for future work.

II. LITERATURE SURVEY
Scholarly works have suggested a number of ways to predict how well new companies will