Research Article

Identification of Dry Bean Varieties Based on Multiple Attributes Using CatBoost Machine Learning Algorithm

S. Krishnan,1 S. K. Aruna,2 Karthick Kanagarathinam,3 and Ellappan Venugopal4

1 Department of , Mahendra Engineering College (Autonomous), Namakkal, Tamil Nadu, India
2 Department of Computer Science and Engineering, School of Engineering and Technology, CHRIST (Deemed to be University), Bangalore, Karnataka, India
3 Department of Electrical and Electronics Engineering, GMR Institute of Technology, Rajam, Andhra Pradesh, India
4 Department of Electronics and Communication Engineering, School of Electrical Engineering and Computing, Adama Science and Technology University, Adama, Ethiopia

Correspondence should be addressed to Ellappan Venugopal; ellappan.venugopal@astu.edu.et

Received 12 December 2022; Revised 11 February 2023; Accepted 3 March 2023; Published 21 April 2023

Academic Editor: Sadiq Hussain

Copyright © 2023 S. Krishnan et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Dry beans are the most widely grown edible legume crop worldwide, with high genetic diversity. Crop production is strongly influenced by seed quality. So, seed classification is important for both marketing and production because it helps build sustainable farming systems. The major contribution of this research is to develop a multiclass classification model using machine learning (ML) algorithms to classify the seven varieties of dry beans. The balanced dataset was created using the random undersampling method to avoid classification bias of ML algorithms towards the majority group caused by the unbalanced multiclass dataset.
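As a rough illustration of the random undersampling step described above, the following minimal sketch downsamples every class to the size of the smallest class. It is not the authors' code: the function name `random_undersample`, the fixed seed, the choice of the smallest class as the target size, and the toy data (three classes, four features) are all assumptions made for illustration only.

```python
import numpy as np


def random_undersample(X, y, seed=0):
    """Hypothetical helper: keep an equal-sized random subset of each class.

    Every class is reduced to the size of the smallest class by sampling
    row indices without replacement.
    """
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()  # target size per class (assumed: smallest class)

    keep = []
    for c in classes:
        idx = np.flatnonzero(y == c)  # row indices belonging to class c
        keep.append(rng.choice(idx, size=n_min, replace=False))

    keep = np.sort(np.concatenate(keep))  # preserve original row order
    return X[keep], y[keep]


# Toy, made-up data: three imbalanced classes with four numeric features.
toy_rng = np.random.default_rng(42)
y = np.array(["A"] * 50 + ["B"] * 20 + ["C"] * 8)
X = toy_rng.normal(size=(len(y), 4))

X_bal, y_bal = random_undersample(X, y)
labels, per_class = np.unique(y_bal, return_counts=True)
balanced_counts = dict(zip(labels.tolist(), per_class.tolist()))
print(balanced_counts)
```

Undersampling discards majority-class rows, so it trades some training data for equal class frequencies; this is why it suits datasets where even the smallest class has enough samples.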
The dataset from the UCI ML repository is utilised for developing the multiclass classification model, and the dataset includes the features of seven distinct varieties of dried beans. To address the skewness of the dataset, a Box-Cox transformation (BCT) was performed on the dataset's attributes. Twenty-two ML classification algorithms were applied to the balanced and preprocessed dataset to identify the best-performing one. The results were validated with a 10-fold cross-validation approach, and during validation, the CatBoost ML algorithm achieved the highest overall mean accuracy of 93.8 percent, with a range of 92.05 percent to 95.35 percent.

1. Introduction

Dry beans are a self-pollinated legume grown for human consumption. Beans are a significant crop on a global scale and are popular with both farmers and consumers. Dry beans account for nearly 50 percent of the grain legumes consumed directly by humans in the majority of developing countries [1]. Beans are a staple food in Sub-Saharan Africa, where they are consumed by more than 200 million people [2]. A system of quality control makes sure that approved seed meets national and global quality benchmarks. For the majority of food products, visual characteristics are the primary criterion used by consumers when making purchasing decisions [3]. Like other legume species, common beans show the most variation in terms of growth patterns, physical features (size, shape, and shading), maturity, and ability to grow and adapt [4, 5]. Sorting and classifying bean seeds manually is a time-consuming process. Additionally, this method is inefficient and tedious, particularly when working with large production volumes. Human inspectors are usually in charge of checking raw materials, and it is difficult to streamline the inspectors' findings. These considerations reaffirm the importance of objective measurement systems. As a result, automatic grading and classification methods are required.
Hindawi, Scientific Programming, Volume 2023, Article ID 2556066, 21 pages, https://doi.org/10.1155/2023/2556066

Recent technological changes have helped researchers in this field a lot. Computer vision systems (CVSs) are being used for quality control and have recently begun to be used as an objective measurement and evaluation system [6–9]. CVS technology, which is primarily camera cum computer