IOSR Journal of Electronics and Communication Engineering (IOSR-JECE)
e-ISSN: 2278-2834, p-ISSN: 2278-8735. Volume 9, Issue 1, Ver. III (Jan. 2014), PP 69-75
www.iosrjournals.org

Lung Cancer detection and Classification by using Machine Learning & Multinomial Bayesian

Mr. Sandeep A. Dwivedi 1, Mr. R. P. Borse 1, Mr. Anil M. Yametkar 2
1 Department of E & TC, SAE, Kondhawa, Pune University.
2 Department of E & TC, ARMEIT, Mumbai University.

Abstract: Image processing is a technique to enhance raw images received from cameras/sensors placed on satellites, space probes and aircraft, or pictures taken in normal day-to-day life, for various applications. Various techniques have been developed in image processing during the last four to five decades. Most of these techniques were developed for enhancing images obtained from unmanned spacecraft, space probes and military reconnaissance flights. Image processing systems are becoming popular due to the easy availability of powerful personal computers, large memory devices, graphics software, etc. Medical image segmentation and classification play an important role in the medical research field. The patient CT lung images are classified into normal and abnormal categories. The abnormal images are then subjected to segmentation to view the tumor portion. Classification depends on the features extracted from the images. We concentrate mainly on the feature extraction stage to yield better classification performance. Texture-based features such as GLCM (Gray Level Co-occurrence Matrix) features play an important role in medical image analysis. In total, 12 different statistical features are extracted. To select the discriminative features among them, we use the sequential forward selection algorithm. A multivariate multinomial Bayesian classifier is then used for the classification stage, and the classifier performance is analysed further.
The effectiveness of the modified weighted FCM algorithm in terms of computational rate is improved by modifying the cluster center and membership value updating criterion. The objective of this paper is to achieve a perfect classification using a multivariate multinomial Bayesian classifier.

Index Terms: Histogram Equalization, Image segmentation, feature extraction, neural network classifier, fuzzy c-means algorithm.

I. INTRODUCTION

The early detection of lung cancer is a challenging problem because of the structure of the cancer cells, most of which overlap one another. Classification is a very important part of digital image analysis. It is a computational procedure that sorts images into groups according to their similarities. In this paper, histogram equalization is used to preprocess the images, followed by a feature extraction process and a neural network classifier that checks, at an early stage, whether the state of a patient is normal or abnormal.

The manual analysis of sputum samples is time consuming and inaccurate, and it requires a highly trained person to avoid diagnostic errors. The segmentation results will be used as a base for a Computer Aided Diagnosis (CAD) system for early detection of lung cancer, which will improve the patient's chances of survival. However, the extreme variation in gray level and relative contrast among the images makes the segmentation results less accurate. We therefore applied a thresholding technique to all images as a pre-processing step to extract the nuclei and cytoplasm regions, because most of the quantitative procedures are based on the nuclear features.

Experimental analysis is carried out on a dataset to evaluate the performance of the different classifiers. Performance is measured by the numbers of correct and incorrect classifications made by each classifier. All experiments are conducted in the WEKA data mining tool.

II. PROCEDURE OVERVIEW

A. Proposed Methodology

In our proposed method, the given test image is first preprocessed to reduce noise and enhance contrast. Texture features (GLCM) are then extracted from it. In the feature extraction stage, statistical measurements are calculated from the gray level co-occurrence matrix for different directions and distances. From the various features extracted, we select the distinct features that will be used for classification. SFS (Sequential Forward Selection) is used for this feature selection. A kernelised Bayesian classifier is then used to classify the test image as normal or abnormal.
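
The feature extraction step described above, in which statistical measurements are computed from the gray level co-occurrence matrix for several directions and distances, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names `offsets`, `glcm` and `glcm_features`, the symmetric pair counting, and the choice of four commonly used features (contrast, energy, homogeneity, entropy) are all assumptions made for this example. The paper extracts 12 features, which are not listed in this section.

```python
import math


def offsets(distance):
    """Pixel offsets (dx, dy) for the four usual GLCM directions.

    Assumption: rows grow downward, so "45 degrees" means one step right
    and one step up, i.e. dy is negative.
    """
    d = distance
    return {0: (d, 0), 45: (d, -d), 90: (0, -d), 135: (-d, -d)}


def glcm(image, dx, dy, levels):
    """Normalised, symmetric gray level co-occurrence matrix.

    image  : list of rows of integer gray levels in range(levels)
    dx, dy : column and row offset between the two pixels of a pair
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both orders (symmetric)
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("offset is larger than the image; no pixel pairs")
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Four illustrative texture statistics of a normalised GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        # Zero-probability cells contribute nothing to the entropy.
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
    }


if __name__ == "__main__":
    # Hypothetical 4x4 test image quantised to 4 gray levels.
    image = [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 2, 2, 2],
        [2, 2, 3, 3],
    ]
    for angle, (dx, dy) in offsets(1).items():
        print(angle, glcm_features(glcm(image, dx, dy, levels=4)))
```

In a pipeline of the kind described here, the features from each direction and distance would be concatenated into one vector per image and passed on to the feature selection and classification stages. A real CT image would also need its intensities quantised to a small number of gray levels first, which this sketch leaves to the caller.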