International Journal of Recent Technology and Engineering (IJRTE), ISSN: 2277-3878, Volume-9 Issue-6, March 2021
Published By: Blue Eyes Intelligence Engineering and Sciences Publication
Retrieval Number: 100.1/ijrte.F5331039621, DOI: 10.35940/ijrte.F5331.039621

An Optimization of Feature Selection for Classification using Bat Algorithm

V. Yasaswini, Santhi Baskaran

Abstract: Data mining is the process of searching large existing databases to obtain new and useful information. It now plays a vital role in fields such as medicine, engineering, banking, education, and fraud detection. In this paper, feature selection, which is a part of data mining, is performed as a step toward classification. The role of feature selection is considered in the context of deep learning, along with its relation to feature engineering. Feature selection is a preprocessing technique that selects the appropriate features from the data set so that classification gives accurate results. Nature-inspired optimization algorithms such as Ant Colony, Firefly, Cuckoo Search, and Harmony Search have performed well, giving high accuracy rates with fewer selected features and good F-measure values. These algorithms are used to perform classification that accurately predicts the target class for each case in the data set. We propose a technique for optimized feature selection for classification using meta-heuristic algorithms. We applied a recent optimization algorithm, the Bat algorithm, to UCI datasets; it gave results comparable to those of the best-performing existing algorithm, Firefly, but with fewer selected features. The work is implemented in Java, and medical datasets (UCI) have been used. These datasets were chosen because they have nominal class features. The number of attributes, instances, and classes varies across the chosen datasets so as to represent different combinations. Classification is done using the J48 classifier in the WEKA tool.
We thoroughly compare the results of the algorithms used here with those of the existing algorithms.

Index Terms: Optimization, Meta-heuristic, Feature Extraction, Deep learning

I. INTRODUCTION

Data mining [1] is the process of searching for important information in the large volumes of data held in a repository. Data mining methods fall into two categories: association analysis and classification analysis. An optimization algorithm provides a systematic way of developing and evaluating new solutions to reach an optimal result. Optimization should be applied only to problems where there is a specific need for a high-quality or competitive solution. The solution obtained through an optimization method is expected to be better than other solutions in terms of the selected objective. This paper reports the accuracy rates of the Bat algorithm and the Modified Bat algorithm and compares them with those of the existing Firefly, Cuckoo Search, and Harmony Search algorithms, which gave almost equal, best accuracy rates in existing work.

Manuscript received on January 30, 2020. Revised Manuscript received on February 11, 2021. Manuscript published on March 30, 2021.
V. Yasaswini, Research Scholar, Computer Science and Engineering Department, Pondicherry Engineering College, Puducherry, India.
Santhi Baskaran, Professor & Head, Information Technology Department, Pondicherry Engineering College, Puducherry, India.

Data mining and optimization techniques have various applications in different fields. Applying them improves the analysis and gives better results and improved accuracy. The fields of application include the following.
1. Network Security
2. Computer Vision and Processing
3. Nature-Inspired Fields
4. Medical Fields
5. Transition Probabilities for Radio Systems
6. Intrusion Detection
7. Education
8. Financial Banking

II. OVERVIEW ON DATA MINING

The data mining process involves the following stages.
a) Problem Definition.
In this stage, the business problem is analyzed to get a clear idea of the problem to be solved. Arriving at an exact definition of the problem takes some time, and it does not require any data tools.
b) Exploration of Data.
In this stage, the data is explored to identify quality problems and to understand the meaning of the metadata. It is the next level after the problem definition stage, with which it frequently exchanges data.
c) Preparation of Data.
In this stage, the data model is built after the exploration of data. Data is collected, unwanted data is removed, and the remaining data is arranged in a format such as tables and records.
d) Data Modeling.
At this stage, after the data has been prepared, different mining functions are applied to the same kind of data. The parameters are changed until an optimal, high-quality mining model is obtained. Finally, the good-quality model is built and evaluated.
e) Evaluation of the Model.
In this stage, the built model is checked and tested to determine whether its quality is good and whether the objective is satisfied.
f) Deployment.
In this stage, after evaluation, the data is exported and the results are stored in database tables.
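
Stages c) to e) can be illustrated with a minimal Java sketch. It is not the paper's implementation: the class `MiningStagesSketch`, its methods, the toy rows, and the single-threshold model are all hypothetical, chosen only to show preparing the data, changing a parameter until the best model is found, and then evaluating that model. For brevity, accuracy is measured on the same rows used for fitting, which a real study would avoid by holding out test data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of stages c), d) and e); not the authors' code.
class MiningStagesSketch {

    // Stage c) Preparation: each row is {feature, label}; rows whose
    // feature is NaN (treated here as "missing") are dropped.
    static List<double[]> prepare(List<double[]> raw) {
        List<double[]> clean = new ArrayList<>();
        for (double[] row : raw) {
            if (!Double.isNaN(row[0])) {
                clean.add(row);
            }
        }
        return clean;
    }

    // Stage e) Evaluation: fraction of rows for which the rule
    // "feature > threshold means class 1" matches the stored label.
    static double accuracy(List<double[]> rows, double threshold) {
        if (rows.isEmpty()) {
            return 0.0;
        }
        int correct = 0;
        for (double[] row : rows) {
            int predicted = row[0] > threshold ? 1 : 0;
            if (predicted == (int) row[1]) {
                correct++;
            }
        }
        return (double) correct / rows.size();
    }

    // Stage d) Modeling: try each candidate parameter value and keep
    // the one with the highest accuracy (first one wins on ties).
    static double fit(List<double[]> rows, double[] candidates) {
        double best = candidates[0];
        double bestAccuracy = -1.0;
        for (double t : candidates) {
            double acc = accuracy(rows, t);
            if (acc > bestAccuracy) {
                bestAccuracy = acc;
                best = t;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy data invented for this sketch; one row has a missing feature.
        List<double[]> raw = Arrays.asList(
            new double[] {1.0, 0}, new double[] {2.0, 0},
            new double[] {Double.NaN, 1}, new double[] {6.0, 1},
            new double[] {7.0, 1}, new double[] {3.0, 0});

        List<double[]> data = prepare(raw);
        double threshold = fit(data, new double[] {0.5, 2.5, 4.5, 6.5});
        System.out.println("threshold=" + threshold
            + " accuracy=" + accuracy(data, threshold));
    }
}
```

On this toy data the row with the missing value is discarded, and the threshold 4.5 separates the two classes, so the program prints `threshold=4.5 accuracy=1.0`.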