Journal of Critical Reviews
ISSN- 2394-5125 Vol 7, Issue 4, 2020
Review Article

MACHINE LEARNING METHODOLOGY FOR MEDICAL DATA ANALYSIS FOR PREDICTION OF RISK

G.L. Sravanthi 1, N. Harika 2, G. Archana 3, B. Sundara Leela 4

1 Asst. Professor, CSE Department, Vignan Nirula Institute of Technology & Science for Women, Peda Palakaluru, Guntur, Andhra Pradesh, India. glsravanthi88@gmail.com
2 Asst. Professor, CSE Department, Vignan Nirula Institute of Technology & Science for Women, Peda Palakaluru, Guntur, Andhra Pradesh, India. narraharika4@gmail.com
3 Asst. Professor, CSE Department, Vignan Nirula Institute of Technology & Science for Women, Peda Palakaluru, Guntur, Andhra Pradesh, India. archu.gunakala@gmail.com
4 Asst. Professor, JNTUK, Andhra Pradesh, India. sundaraleela.b@gmail.com

Received: 19.12.2019 Revised: 22.01.2020 Accepted: 24.02.2020

Abstract
Data mining is the nontrivial process of discovering information in a large volume of data. Such information can help in making important decisions. Medical data show special characteristics, including noise resulting from human as well as systematic errors, missing values, and even sparsity. The quality of the data has major implications for the quality of the mining results. For medical data classification, it is therefore important to perform preprocessing steps that remove, or at least alleviate, some of the problems associated with medical data. Clustering is a descriptive data mining task. A clustering algorithm, also called an unsupervised learning algorithm, learns from an unlabeled dataset, groups the instances into clusters based on their similarity, and builds the clustering model. Clustering resembles classification in that the data are grouped, but in clustering the groups are not predefined.
The main types of clustering are hierarchical clustering, partition clustering, categorical clustering, density-based clustering, and grid-based clustering. The main intention of this research is to classify medical data with high accuracy. To achieve promising results, a novel data classification method has been designed that uses an Improved Cluster Optimal Classifier (ICOC). The proposed method is compared with traditional methods, and the results show that it performs better and is more accurate.

Keywords: Medical Data Classification, Machine Learning Methodologies, Data Mining, Data Classification, Cluster Based Classification.

© 2019 by Advance Scientific Research. This is an open-access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
DOI: http://dx.doi.org/10.31838/jcr.07.04.151

INTRODUCTION
Classification is the process of assigning a data item to one of a set of predefined classes. The classification process follows two steps [1]. It involves examining the features of a newly presented object and assigning it to a predefined class. First, a model is built that describes a predefined set of data classes or concepts [2]; training data are used to build the model. Second, the model is used for classification. Figure 1 gives a schematic representation of classification. Classification is a predictive data mining task. To accomplish it, a classification algorithm learns from the dataset and builds the classifier [15]. The dataset contains a set of features (columns) and instances (rows), together with a target-class attribute that holds the class label associated with each instance [16]. Unlabeled data are then given to the classifier, which predicts the class label of each unlabeled instance [17]. Classification can be performed on structured or unstructured data [18].
The goal of classification is to identify the category to which new data belong.

Fig 1: Schematic Classification Representation

A data classifier requires a selection of features that must be tailored separately to each problem [19] [20]. Following feature selection, classifier development requires separating the data into training and test data, and it proceeds through two major phases of classifier development, as shown in Figure 2.
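
The separation into training and test data, followed by a training phase and a testing phase, can be illustrated with a minimal sketch. It is not the ICOC method: a simple nearest-centroid classifier is swapped in purely for illustration, and the function names (`train_test_split`, `fit_centroids`, `predict`, `accuracy`), the two-feature toy records, and the labels "low" and "high" are all hypothetical assumptions rather than data or code from this study.

```python
import random

def train_test_split(rows, labels, test_fraction=0.25, seed=0):
    """Shuffle the instances and separate them into training and test sets."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(rows) * test_fraction))

    def pick(idx):
        return [rows[i] for i in idx], [labels[i] for i in idx]

    return pick(indices[n_test:]), pick(indices[:n_test])

def fit_centroids(rows, labels):
    """Training phase: compute one mean feature vector (centroid) per class."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def predict(centroids, row):
    """Assign the class whose centroid is nearest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda lab: dist(centroids[lab]))

def accuracy(centroids, rows, labels):
    """Testing phase: fraction of held-out instances classified correctly."""
    correct = sum(predict(centroids, r) == y for r, y in zip(rows, labels))
    return correct / len(rows)

# Hypothetical toy records: two numeric features per instance, made up for
# illustration only (not taken from any medical dataset).
X = [[1.0, 1.2], [0.8, 1.0], [1.2, 0.9], [1.1, 1.1],
     [5.0, 5.2], [4.8, 5.1], [5.2, 4.9], [5.1, 5.0]]
y = ["low", "low", "low", "low", "high", "high", "high", "high"]

(train_X, train_y), (test_X, test_y) = train_test_split(X, y)
model = fit_centroids(train_X, train_y)          # phase 1: training
test_accuracy = accuracy(model, test_X, test_y)  # phase 2: testing
print("test accuracy:", test_accuracy)
```

Keeping the test instances out of the training phase is the point of the split: the accuracy is then measured on data the classifier has not seen, which gives a fairer estimate of how it would label new instances.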