Journal of Critical Reviews
ISSN- 2394-5125 Vol 7, Issue 4, 2020
Review Article
MACHINE LEARNING METHODOLOGY FOR MEDICAL DATA ANALYSIS FOR
PREDICTION OF RISK
G.L. Sravanthi¹, N. Harika², G. Archana³, B. Sundara Leela⁴
¹ Asst. Professor, CSE Department, Vignan Nirula Institute of Technology & Science for Women, Peda Palakaluru, Guntur, Andhra Pradesh, India. glsravanthi88@gmail.com
² Asst. Professor, CSE Department, Vignan Nirula Institute of Technology & Science for Women, Peda Palakaluru, Guntur, Andhra Pradesh, India. narraharika4@gmail.com
³ Asst. Professor, CSE Department, Vignan Nirula Institute of Technology & Science for Women, Peda Palakaluru, Guntur, Andhra Pradesh, India. archu.gunakala@gmail.com
⁴ Asst. Professor, JNTUK, Andhra Pradesh, India. sundaraleela.b@gmail.com
Received: 19.12.2019 Revised: 22.01.2020 Accepted: 24.02.2020
Abstract
Data mining is the nontrivial process of extracting information from a large volume of data. Such information can be helpful in making
important decisions. Medical data exhibit special characteristics, including noise resulting from human as well as systematic errors,
missing values and even sparse data. The quality of the data has major implications for the quality of the mining results. In medical
data classification it is important to perform preprocessing steps so as to remove, or at least alleviate, some of the problems associated
with medical data. Clustering is a descriptive data mining task. A clustering algorithm, also called an unsupervised learning
algorithm, learns from an unlabeled dataset, groups (clusters) the instances based on their similarity, and builds the clustering model.
Clustering resembles classification in that data are grouped, but in clustering the groups are not predefined.
Clustering methods fall into the following types: hierarchical clustering, partition clustering, categorical clustering,
density-based clustering and grid-based clustering. The main intention of this research is to classify medical data with high
accuracy. To achieve promising results, a novel data classification method has been designed that utilizes an Improved Cluster
Optimal Classifier (ICOC). The proposed method is compared with traditional methods, and the results show that it
performs better and is more accurate.
Keywords: Medical Data Classification, Machine Learning Methodologies, Data Mining, Data Classification, Cluster Based Classification.
© 2019 by Advance Scientific Research. This is an open-access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
DOI: http://dx.doi.org/10.31838/jcr.07.04.151
INTRODUCTION
Classification is the process of assigning a data item to one of
a set of predefined classes. It involves examining the features of a
newly presented object and assigning it to a predefined class.
The classification process follows two steps [1].
First, a model is built that describes a predefined
set of data classes or concepts [2]; training data are
used to build this model. Second, the model is
used for classification. Figure 1 shows the schematic
representation of classification.
Classification is a predictive-based data mining task. In order to
accomplish the classification task, the classification algorithm is
used to learn the dataset and to build the classifier [15]. The
dataset contains a set of features (columns) and instances (rows)
with a target-class attribute which contains the class-label
associated with each instance of the dataset [16]. The unlabeled
data is given to the classifier in order to predict the class-label of
the unlabeled instance [17]. Classification can be performed on
structured or unstructured data [18]. The goal of classification is
to identify the category to which new data belongs.
Fig 1: Schematic Classification Representation
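The two-step process described above (build a model from labelled training data, then use it to label new instances) can be illustrated with a minimal sketch. A nearest-centroid classifier is assumed here purely for illustration; it is not the ICOC method proposed in this paper, and the function names `build_model` and `classify`, as well as the toy (age, systolic blood pressure) values, are hypothetical.

```python
from collections import defaultdict
from math import dist


def build_model(instances, labels):
    """Step 1: learn one centroid per class from labelled training data."""
    grouped = defaultdict(list)
    for features, label in zip(instances, labels):
        grouped[label].append(features)
    # Centroid = per-feature mean of all instances sharing a class label.
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(model, features):
    """Step 2: assign an unlabelled instance to the class with the nearest centroid."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical toy data: (age, systolic blood pressure) -> risk label.
train_X = [(35, 118), (42, 122), (61, 150), (68, 160)]
train_y = ["low", "low", "high", "high"]

model = build_model(train_X, train_y)
prediction = classify(model, (65, 155))  # unlabelled instance to be classified
```

Each row of `train_X` is one instance, each tuple position is one feature, and `train_y` plays the role of the target-class attribute.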
A data classifier requires a choice of features that must be
tailored separately to each problem [19] [20].
Following feature selection, classifier development
requires separating the data into training and test data and
passes through two major phases of data classifier
development, as shown in Figure 2.
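As a rough sketch of the steps just described, the code below selects a subset of features, separates the data into training and test parts, and measures accuracy on the held-out test data. The 1-nearest-neighbour rule, the 25% test fraction, the fixed seed, the helper names and the toy records are all assumptions made for illustration; they are not taken from this paper.

```python
import random
from math import dist


def select_features(row, keep):
    """Keep only the chosen feature columns of a (features, label) row."""
    features, label = row
    return tuple(features[i] for i in keep), label


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle labelled rows reproducibly and split them into training and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def predict_1nn(train, features):
    """Return the label of the closest training instance (1-nearest neighbour)."""
    nearest = min(train, key=lambda row: dist(row[0], features))
    return nearest[1]


def accuracy(train, test):
    """Fraction of held-out test rows whose label is predicted correctly."""
    hits = sum(predict_1nn(train, x) == y for x, y in test)
    return hits / len(test)


# Hypothetical records: ((age, systolic BP, record id), risk label).
# The record id carries no signal, so feature selection drops column 2.
records = (
    [((30 + i, 115 + i, i), "low") for i in range(6)]
    + [((60 + i, 150 + i, 100 + i), "high") for i in range(6)]
)

selected = [select_features(r, keep=(0, 1)) for r in records]
train, test = train_test_split(selected, test_fraction=0.25, seed=0)
score = accuracy(train, test)
```

Evaluating only on rows that were never used to build the model is what makes the accuracy figure a fair estimate of performance on new data.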