SN Computer Science (2020) 1:170
https://doi.org/10.1007/s42979-020-0097-6

ORIGINAL RESEARCH

Heart Disease Prediction Using CNN Algorithm

VirenViraj Shankar 1 · Varun Kumar 1 · Umesh Devagade 1 · Vinay Karanth 1 · K. Rohitaksha 1

Published online: 15 May 2020
© Springer Nature Singapore Pte Ltd 2020

Abstract
In this paper, we aim to predict accurately whether an individual is at risk of heart disease. The prediction is made by applying machine learning algorithms to the training data that we provide. Once a person enters the requested information, the algorithm is applied and the result is generated. The accuracy is expected to decrease when the medical data themselves are incomplete. We implement the prediction model over real-life hospital data. We propose to use a convolutional neural network algorithm as a disease risk prediction algorithm using structured and possibly also unstructured patient data. The accuracy obtained using the developed model ranges between 85 and 88%. As further work, we propose applying other machine learning algorithms to the training data to predict the risk of disease and comparing their accuracies so that we can identify the most accurate one. Attributes can also be modified in an attempt to improve the accuracy further.

Keywords Machine learning · Big data analytics · Deep learning · Medical applications · Convolutional neural network

Introduction
It is reported that 50% of Americans suffer from at least one chronic disease. As a result, 80% of US healthcare spending goes to chronic disease treatment. With the rise in living standards, the effect of these diseases also increases. The USA as a whole spends almost $2.7 trillion per annum on the treatment of these diseases. The USA is not the only country where large sums are spent treating chronic diseases.
In China, for example, chronic diseases are reported to be the leading cause of death, accounting for more than 85% of all deaths in the world's most populated country. Clearly, early diagnosis and treatment are essential, not just to save costs but also to save human life and improve quality of life.

Chen et al. proposed a healthcare system using smart clothing for sustainable health monitoring. Qiu et al. thoroughly studied heterogeneous systems and achieved the best results for cost minimization on tree and simple path cases. Patients' statistical information, test results and disease history are recorded in the electronic health record (EHR), enabling us to identify potential data-centric solutions to reduce the costs of medical case studies.

With the development of big data analytics technology, more attention has been paid to disease prediction from the perspective of big data analysis. Various studies have selected characteristics automatically from large amounts of data, rather than relying on previously selected characteristics, to improve the accuracy of risk classification. To address these problems, structured and unstructured healthcare data can be combined to assess the risk of disease.

How the Model Works

Figure 1 depicts the various steps carried out during the prediction of heart disease.

1. The process starts with data collection. In this step, different types of data, mainly structured, semi-structured or unstructured, can be collected from various sources such as hospitals.
2. Once the data are collected, they are first cleaned to remove missing values and to bring them to the same level of granularity. The cleaned data are then split into a training dataset and a test dataset.

This article is part of the topical collection “Advances in Computational Intelligence, Paradigms and Applications” guest edited by Young Lee and S. Meenakshi Sundaram.
* Varun Kumar
varunkumar8156@gmail.com

1 Department of Computer Science and Engineering, JSS Academy of Technical Education, Visveswaraya Technological University, Bangalore 56006, India