International Journal of Innovative Technology and Exploring Engineering (IJITEE), ISSN: 2278-3075, Volume-8 Issue-9, July 2019
Published By: Blue Eyes Intelligence Engineering & Sciences Publication
Retrieval Number: H7178068819/19©BEIESP
DOI: 10.35940/ijitee.H7178.078919

Abstract: The presence of nodules in lung images can indicate several types of disease, such as tumors and cancer. Detecting nodules in lung images is a common task that requires considerable computation for pre-processing, tissue detection, removal of non-nodule regions and, finally, nodule segmentation. In this paper we propose a multi-threshold descriptor-based algorithm that applies multiple levels of thresholds to the image in order to detect and remove all non-nodule regions, and then uses the KNN algorithm to classify the input image as benign or malignant. The training and testing sets are carefully selected to obtain optimal accuracy for the system. In this work, we obtain an accuracy of 82.65%, with sensitivity and specificity of 85.71% and 80.35% respectively, for classification of the input medical image.

Index Terms: classification, KNN algorithm, multi-threshold, nodule.

I. INTRODUCTION

Lung cancer, also known as lung carcinoma, is a malignant lung tumor characterized by uncontrolled cell growth in the lung tissues. It is one of the major threats to human life throughout the world. The risk of death from lung cancer is high compared with other kinds of cancer. Lung cancer stands out among the most dangerous cancers in the world, with the lowest survival rate after diagnosis and a mortality count that grows every year. If the disease is identified in its early stages, the survival rate of an individual is high [1]. Generally, lung cancer has four stages, numbered 1 to 4.
Staging of cancer relies on tumor size and lymph node position. A CT scan is more powerful than a conventional chest X-ray for identifying and treating malignant growth in the lung. An estimated 85% of lung cancer cases in males and 75% in females are caused by smoking [2]. An estimated 228,150 new cases of lung cancer will be diagnosed in the US in 2019, and an estimated 142,670 deaths from lung cancer will occur in 2019 [1]. Therefore, diagnosing the disease at an early stage is very important. The aim of this paper is to detect cancerous lung nodules and to provide accurately evaluated outcomes by applying enhancement, segmentation and classification methods.

Revised Manuscript Received on July 05, 2019
Sakshi Wasnik, Electronics Engineering Department, Shri. Ramdeobaba College of Engineering and Management, Nagpur, India
Pallavi Parlewar, Electronics and Communication Engineering Department, Shri. Ramdeobaba College of Engineering and Management, Nagpur, India
Dr. Prashant Nimbalkar, Radiologist, Precision Scan and Research Centre, Nagpur, India.

II. LITERATURE REVIEW

Ashwin S et al. [3] proposed a two-stage CAD system. The first stage preprocesses the image and then segments the cancerous nodule region; the second stage uses an ANN trained with the BFGS algorithm. Adaptive median filtering is used to eliminate the noise present in the image, contrast limited adaptive histogram equalization (CLAHE) is used to enhance the CT image, and multilevel thresholding is adopted for segmentation. They achieved an accuracy of 96.7%, a sensitivity of 92.1% and a specificity of 94.30%. Imran et al. presented a method for segmenting the lung region from CT scan images [4].
They employed the Wavelet Packet Frame (WPF) technique to acquire spatial frequency representations and then applied k-Means clustering for better segmentation of lung tissues. Their results showed that the technique is powerful and can effectively segment lung regions in numerous images from different scans. Azar et al. [5] suggested a decision support tool for the identification of breast cancer nodules based on three kinds of classifier, viz. single decision tree (SDT), boosted decision tree (BDT) and decision tree forest. BDT was found to perform better than SDT, with accuracies of 98.83% and 97.07% respectively. Li-Hong Xiao et al. used a random forest algorithm for the prediction of prostate cancer, combining transrectal ultrasound findings, age and serum PSA levels [6]. This model helps decide whether an invasive biopsy is necessary and gives more accurate results. The only disadvantage of this method is that it does not take into account all factors that may be useful for prostate cancer diagnosis, such as family history of prostate cancer, digital rectal exam results and Gleason score. The method gives an accuracy of 83.10%, sensitivity and specificity of 65.64% and 93.83% respectively, a positive predictive value of 86.72% and a negative predictive value of 81.64%. Yeh et al. [7] introduced a decision-tree model as one of the ideal models, particularly for brain disease, in comparison with a Bayesian classifier and a back-propagation neural network; it achieved a very high accuracy of 99.59%. Fan et al. [8] presented a model based on hybrid reasoning and a fuzzy decision tree (BFDT) for the detection of liver disease, with an accuracy of 81.6%, the highest among the models compared. Ozcift [9] used a best-first-search random forest algorithm and obtained a classification accuracy of 98.9%. Nguyen et al.
[10] utilized a random forest classification algorithm with feature selection for the diagnosis of breast cancer and achieved a good classification accuracy

Nodule detection in lung using multi-threshold segmentation
Sakshi Wasnik, Pallavi Parlewar, Prashant Nimbalkar
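
The studies surveyed above, like the abstract of this paper, are compared by accuracy, sensitivity, specificity and, in one case, positive and negative predictive value. As a minimal sketch of how these figures follow from a binary confusion matrix, the Python snippet below computes them from raw counts. The function name `classification_metrics` and the counts in the usage example are assumptions made for illustration. The source does not report a confusion matrix, and the counts were chosen only because they reproduce the abstract's 82.65%, 85.71% and 80.35% to within rounding.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return common binary-classification metrics as percentages.

    tp: malignant cases classified as malignant (true positives)
    fn: malignant cases classified as benign (false negatives)
    tn: benign cases classified as benign (true negatives)
    fp: benign cases classified as malignant (false positives)
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "sensitivity": 100.0 * tp / (tp + fn),  # true positive rate
        "specificity": 100.0 * tn / (tn + fp),  # true negative rate
        "ppv": 100.0 * tp / (tp + fp),          # positive predictive value
        "npv": 100.0 * tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts (an assumption, not taken from the paper): one split of
# 98 test images that is consistent with the percentages in the abstract.
metrics = classification_metrics(tp=36, fn=6, tn=45, fp=11)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}%")
```

Sensitivity and specificity are each computed over one true class only, so a classifier can score well on accuracy while missing many malignant cases. This is why the surveyed papers report all three figures rather than accuracy alone.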