Data Mining Based Analysis of Cancer Risks in Iraq *

Ersin Elbasi
Department of Computer Engineering, Cankaya University, Ankara, Turkey
eelbasi@cankaya.edu.tr

Muhamad Azhar Abdilatef and Zead Mohammed Yosif
College of Engineering, University of Mosul, Mosul, Iraq (M.Sc. Students at Cankaya University)
{c1271504 & c1271506}@student.cankaya.edu.tr

* This work is partially supported by NSF Grant #2003168 to H. Simpson and CNSF Grant #9972988 to M. King.

Abstract - In this paper, we apply six algorithms using Weka (a data mining program) to 400 cases of patients with different types of cancer recorded in Mosul, Iraq, in 2007. After applying these algorithms to this data set, we present the results and compare the results obtained with the different algorithms.

Index Terms - Mosul cancer data set, applying data mining algorithms, Weka data mining.

I. INTRODUCTION

Since 1980, the Iraqi people have been exposed to many risks because of the wars and conflicts that have taken place on Iraqi land. Millions of tons of bombs fell on Iraqi cities, killing millions of people and causing widespread destruction. The effects of these wars did not stop there: they had severe side effects on the lives of Iraqi citizens, who suffered from shock due to fear and terror, as well as from exposure to large amounts of radiation from the radioactive uranium and the many chemical materials contained in the bombs and missiles that fell on Iraqi cities. This led to the emergence of millions of cases of cancer in Iraq, in addition to many other diseases and congenital malformations in children.

We obtained our data set from a study of 400 patients in one of Mosul's hospitals in 2007 [8]. It includes various types of cancer across many different ages and educational levels, together with the types of treatment that were used.
We applied a number of algorithms in Weka (a data mining program) and then compared the results obtained with these algorithms.

II. SIMPLE DEFINITION OF DATA MINING

Data mining, a field at the intersection of computer science and statistics, is the process of discovering patterns in large data sets. It utilizes methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Aside from the raw analysis step, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating [4].

The data set used in our project (Fig. 1) covers cancer cases in Mosul, Iraq, in 2007. The cases span many different ages and statuses, and the data set also records the types of treatment used. In the following, we define cancer and the cancer types that we deal with in our project.

Cancer, known medically as a malignant neoplasm, is a broad group of various diseases, all involving unregulated cell growth. In cancer, cells divide and grow uncontrollably, forming malignant tumors, and invade nearby parts of the body. The cancer may also spread to more distant parts of the body through the lymphatic system or bloodstream. Not all tumors are cancerous. Benign tumors do not grow uncontrollably, do not invade neighbouring tissues, and do not spread throughout the body. There are over 200 different known cancers that afflict humans [5].

Fig. 1 Data Set.

Determining what causes cancer is complex. Many things are known to increase the risk of cancer, including tobacco use, certain infections, radiation, lack of physical activity, obesity,