International Journal of Recent Technology and Engineering (IJRTE), ISSN: 2277-3878, Volume-8 Issue-6, March 2020
Retrieval Number: F7124038620/2020©BEIESP, DOI: 10.35940/ijrte.F7124.038620
Published By: Blue Eyes Intelligence Engineering & Sciences Publication

Abstract: Lung cancer has been one of the deadliest diseases of recent decades and a major cause of death in both men and women. Lung cancer has many causes, but classifying the tumor and detecting it at the right stage is the most important part of dealing with it. This paper surveys the approaches to lung cancer detection reported in the literature, with the aim of improving the ability to detect the disease. Digital image processing and data mining are equally important, because prediction uses either an image dataset or a statistical dataset: digital image processing is applied to pre-process an image dataset, and data mining is applied to a statistical dataset. After pre-processing, segmentation, and feature extraction, various machine learning algorithms are applied to predict lung cancer. We therefore first give a sketch of machine learning and then review the kinds of data, image or statistical, on which machine learning has been used for classification. Once classification is done, a confusion matrix is generated to calculate accuracy, sensitivity, and precision, which are used to measure the performance of the proposed model.

Keywords: Lung Cancer, Machine Learning and Its Techniques, Digital Image Processing

I. INTRODUCTION

The rapid growth of machine learning has drawn wide interest because of its many applications in areas such as fraud detection, computer vision, bioinformatics, and medical image diagnosis.
Machine learning is used to predict cancer from medical reports such as CT scans, X-rays, and MRI, and machine learning techniques have been shown to make it easier for doctors to detect disease at the right stage. Cancer is a leading cause of death globally: the World Health Organization estimated 9.8 million cancer deaths by 2018. The most common cancer is lung cancer, and its death rate is higher than that of any other type of cancer [1]. Lung cancer is one of the leading causes of cancer death in both men and women [2]. Lung cancer has various causes, such as smoking and exposure to radon gas, but it is not limited to people who smoke; it can also result from secondhand smoke. Treatment monitoring and lung nodule analysis using computed tomography (CT) medical images are useful strategies for diagnosing lung cancer early and for monitoring its severity [3].

This paper covers the machine learning techniques used to predict cancer from both kinds of data: image data, that is, CT scan reports from which the location or size of a tumor can be predicted, and CSV files containing data such as age, gender, and smoking rate. The paper is organized as follows. Section 1 covers enabling terminology, Section 2 machine learning, and Section 3 the machine learning algorithms used for prediction. Sections 4 and 5 present comparisons, and Section 6 presents the discussion, followed by the conclusion and future scope.

Revised Manuscript Received on February 14, 2020.
* Correspondence Author
Nikita Banerjee*, Department of Computer Science and Engineering, College of Engineering and Technology, Bhubaneswar, India. E-mail: nikitabanerjee1994@gmail.com
Subhalaxmi Das, Department of Computer Science and Engineering, College of Engineering and Technology, Bhubaneswar, India. E-mail: sdascse@cet.edu.in
II. ENABLING TERMINOLOGY

Pre-processing means cleaning the data so that it is free of noise and yields higher accuracy. A cancer dataset can be image data or numerical data in CSV (comma-separated values) format, and the two kinds of dataset are pre-processed differently: for image data we can use digital image processing, and for clinical data we can use data mining techniques. After pre-processing, we apply machine learning to classify the data and calculate the accuracy.

A. Image Pre-Processing Using Digital Image Processing

Digital image processing is the technique of manipulating an image, or performing operations on it, in order to extract useful information. It starts with image pre-processing, in which the image is enhanced using techniques such as histogram processing and log transformation. Image restoration is then applied to the enhanced image: noise such as Gaussian noise or salt-and-pepper noise is added, and a filter suited to the particular noise, such as a mean filter or a median filter, is applied to remove it, so that a clearer picture is obtained. Once the noise is removed, color conversion is applied to convert the image from red, green, blue (RGB) to grey level or from RGB to HSV (hue, saturation, value). After color conversion, image segmentation is performed, which divides the image into its constituent parts; segmentation techniques include edge detection, point detection, and region-based detection. Image segmentation is very important in digital image processing because it keeps only the part of the image that is needed.
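The pre-processing chain described above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the method of any surveyed paper. The function names (`log_transform`, `rgb_to_gray`, `median_filter`, `threshold_segment`, `preprocess`), the grey-level weights 0.299/0.587/0.114, the 3x3 window, and the threshold of 128 are all assumptions chosen for the example. Simple global thresholding stands in for the edge-, point-, and region-based segmentation methods mentioned in the text, and the image is converted to grey level before filtering so that the filter works on a single channel.

```python
import math
from statistics import median


def rgb_to_gray(rgb):
    """Convert rows of (R, G, B) tuples to grey levels (assumed luma weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb]


def log_transform(gray, max_level=255):
    """Enhancement: s = c * log(1 + r), scaled so max_level maps to itself."""
    c = max_level / math.log(1 + max_level)
    return [[round(c * math.log(1 + p)) for p in row] for row in gray]


def median_filter(gray, size=3):
    """Replace each pixel by the median of its neighbourhood.

    Pixels outside the image are taken from the nearest edge pixel.
    A median filter is a common choice against salt-and-pepper noise.
    """
    h, w = len(gray), len(gray[0])
    k = size // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                gray[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-k, k + 1)
                for dx in range(-k, k + 1)
            ]
            row.append(median(window))
        out.append(row)
    return out


def threshold_segment(gray, level):
    """Segmentation stand-in: 1 where the pixel is brighter than level, else 0."""
    return [[1 if p > level else 0 for p in row] for row in gray]


def preprocess(rgb, level=128):
    """Grey conversion, log enhancement, median filtering, then thresholding."""
    gray = rgb_to_gray(rgb)
    enhanced = log_transform(gray)
    denoised = median_filter(enhanced)
    return threshold_segment(denoised, level)
```

Calling `preprocess(image)` on a list of rows of `(R, G, B)` tuples returns a binary mask of the same shape. In practice a library such as OpenCV or scikit-image would replace these hand-written loops, and the threshold would be tuned to the scanner and dataset.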
After image segmentation, feature extraction is performed. Feature extraction can be defined as the process of reducing dimensionality, by which a set of raw data is reduced to more manageable groups for processing. There are various approaches to feature extraction, such as those based on region,

Machine Learning Techniques for Prediction of Lung Cancer
Nikita Banerjee, Subhalaxmi Das