Tiwari Divyansh et al.; International Journal of Advance Research, Ideas and Innovations in Technology
ISSN: 2454-132X
Impact factor: 4.295
(Volume 5, Issue 2)
Available online at: www.ijariit.com
© 2019, www.IJARIIT.com All Rights Reserved

Virtual doctor

Divyansh Tiwari
tiwari.divyansh007@gmail.com
IMS Engineering College, Ghaziabad, Uttar Pradesh

Arpit Kumar
arpit.gupta151@gmail.com
IMS Engineering College, Ghaziabad, Uttar Pradesh

Ayush Tripathi
ayushtripathi1024@gmail.com
IMS Engineering College, Ghaziabad, Uttar Pradesh

ABSTRACT
The healthcare environment is still ‘information rich’ but ‘knowledge poor’. A wealth of data is available within healthcare systems, but effective analysis tools for discovering hidden relationships in that data are lacking. The aim of this work is to design a GUI-based interface in which a patient’s symptoms are entered and the disease the patient is likely to have is predicted using various machine learning algorithms. The prediction is performed by mining the patient’s symptom data or a data repository. This paper analyzes disease prediction systems that use a larger number of input attributes. The system uses medical attributes such as fever, pain, and cholesterol to predict the likelihood of a patient having a particular disease. So far, over 100 attributes are used for prediction. The data mining classification techniques Decision Trees, Naive Bayes, and Random Forest are analyzed on a disease database, and their performance is compared on the basis of accuracy.

Keywords: Predictive analysis, Data mining, Machine learning

1. INTRODUCTION
Data mining is the process of discovering previously unknown patterns in an enormous amount of data. As the patient population increases, medical databases also grow every day. Processing and investigating this medical data is difficult without a computer-based analysis system. Here, a computer-based analysis system means a mechanized medical diagnosis system.
This mechanized diagnosis system supports the medical practitioner in making good decisions about treatment and disease. Data mining gives doctors a broad platform for handling large patient datasets in many ways, such as making sense of complex diagnostic tests, interpreting previous results, and combining dissimilar data. In today's computerized world, given automatic and dynamic requirements, the healthcare system should become more efficient by predicting disease and providing appropriate medications through user-friendly mobile applications. This study is aimed mainly at people with health concerns and those who want to be their own doctor. It is an interactive service for users who want to know, from their symptoms, what health issues they are experiencing. It is easy to access and can also be used to search for medicines for the predicted diseases.

2. LITERATURE SURVEY
2.1 Comparative analysis
In the paper “Disease Prediction System using data mining techniques”, the author discusses data mining techniques such as association rule mining, classification, and clustering for analyzing different kinds of heart-related problems. The database used contains a collection of records, each with a single class label; a classifier produces a brief and clear definition of each class that can be used to classify subsequent records. The data classification is based on the MAFIA algorithm, which is reported to give accurate results; the data is evaluated using entropy-based cross-validation and partitioning techniques, and the results are compared. The C4.5 algorithm is used as the training algorithm to show the rank of heart attack with the decision tree. The heart disease database is clustered using the K-means clustering algorithm, which extracts the data relevant to heart attack from the database.
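To illustrate the entropy-based reasoning behind decision-tree learners such as C4.5, the following sketch computes entropy, information gain, and gain ratio (the split criterion generally associated with C4.5) for two symptom attributes. The records, the attribute names `chest_pain` and `high_cholesterol`, and the class labels are invented for illustration and are not taken from the surveyed paper; this is a minimal sketch of the split criterion under those assumptions, not a full C4.5 implementation.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def split_by(records, labels, attribute):
    """Group the class labels by the value each record has for `attribute`."""
    groups = {}
    for record, label in zip(records, labels):
        groups.setdefault(record[attribute], []).append(label)
    return groups


def information_gain(records, labels, attribute):
    """Reduction in label entropy obtained by splitting on `attribute`."""
    total = len(labels)
    groups = split_by(records, labels, attribute)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(records, labels, attribute):
    """Information gain normalised by the entropy of the split itself."""
    total = len(labels)
    groups = split_by(records, labels, attribute)
    split_info = -sum(
        len(g) / total * log2(len(g) / total) for g in groups.values()
    )
    if split_info == 0:  # attribute has a single value, so it cannot split
        return 0.0
    return information_gain(records, labels, attribute) / split_info


# Hypothetical toy records (illustrative only, not real patient data).
records = [
    {"chest_pain": "yes", "high_cholesterol": "yes"},
    {"chest_pain": "yes", "high_cholesterol": "no"},
    {"chest_pain": "no", "high_cholesterol": "yes"},
    {"chest_pain": "no", "high_cholesterol": "no"},
]
labels = ["risk", "risk", "no_risk", "no_risk"]

# A decision-tree learner would split first on the highest-scoring attribute.
attributes = ["chest_pain", "high_cholesterol"]
best = max(attributes, key=lambda a: gain_ratio(records, labels, a))
print(best)
```

In this toy data `chest_pain` separates the two classes perfectly, so it scores highest and would be chosen as the root split, while `high_cholesterol` carries no information about the label.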
The system in that paper faces some limitations: the time complexity is high because of DFS traversal; with C4.5, time complexity increases while searching insignificant branches; and, lastly, no precautions are defined.

The paper “A study on data mining prediction techniques in the healthcare sector” [2] discusses the following fields. Knowledge Discovery in Databases (KDD) is the process of turning low-level data into high-level knowledge. Hence, KDD refers to the nontrivial extraction of implicit, previously unknown, and potentially useful information from data in databases. The KDD process comprises a few steps leading from raw data collections to some form of new knowledge. The iterative process consists of the following steps: data cleaning, data integration, data selection, data transformation, data mining, pattern evaluation, and knowledge. The data mining techniques used for healthcare prediction are as follows: neural networks, Bayesian classifiers, decision trees, and support vector machines. The paper presents a comparative study of different healthcare predictions and a study of data mining techniques and tools for the prediction of heart disease, various cancers, diabetes, eye disease, and dermatological conditions. A data mining based prediction system reduces the human effects and