International Journal of Systemics, Cybernetics and Informatics (ISSN 0973-4864)

Improved Genetic-Fuzzy System For Breast Cancer Diagnosis

P. Ganesh Kumar 1 and D. Devaraj 2
1 MS Research Scholar, Electrical and Electronics Engineering
2 Sr. Professor and Head, Electrical and Electronics Engineering
Kalasalingam University, Krishnankoil-626190, Tamil Nadu, India
Tel: 04563-289042, E-mail: pganeshkumar_ms@yahoo.co.in

Abstract

Breast cancer diagnosis is an important real-world medical problem. Fuzzy Rule Based Systems (FRBS) have been successfully applied to many medical diagnosis problems. An important issue in the design of an FRBS is the formation of the fuzzy if-then rules and membership functions. This paper presents an Improved Genetic Algorithm (IGA) approach to obtain the optimal rule set and membership functions. Advanced genetic operators are applied to improve the performance of the GA in designing the fuzzy classifier. The performance of the proposed approach is demonstrated using the Wisconsin breast cancer data available in the UCI machine learning repository. The simulation study shows that the proposed IGFRBS produces a fuzzy diagnostic system with a minimal number of rules and a classification accuracy better than the results reported in the literature.

Keywords: Fuzzy Logic, if-then rules, membership function, Genetic Algorithm, Breast cancer diagnosis.

1. Introduction

Diagnosis of disease is a major class of problems in medical science, and it involves conducting various tests on the patient. Even when several tests have been conducted, it can be difficult for the medical expert to arrive at a final diagnosis. Over the past decades, a need has arisen for computerized diagnostic tools [1] that help physicians make decisions automatically from the data related to the disease. A prime target for such computerized tools is the domain of cancer diagnosis.
Breast cancer [2] is the most common cancer among women in many countries, excluding skin cancer. In general, breast cancer diagnosis is concerned with determining whether the patient under consideration exhibits the symptoms of a benign case or a malignant one. Most breast cancers are detected as a lump or mass on the breast, or through self-examination or mammography [3]. Screening mammography is the tool available for detecting cancerous lesions before clinical symptoms appear [4]. Fine needle aspiration (FNA) [5] of breast masses is a cost-effective, non-traumatic, and mostly non-invasive diagnostic test that obtains the information needed to evaluate malignancy.

The Wisconsin breast cancer diagnosis (WBCD) database [6] is the result of the efforts made at the University of Wisconsin Hospital for diagnosing breast masses. In [7], linear programming techniques were proposed for diagnosing breast cancer using this database. However, the solutions they produce lack understandability, i.e. the diagnostic decisions are essentially black boxes, with no explanation of how they were reached. A number of research works have been carried out on extracting Boolean rules from neural networks [8, 9]. Even though the results they produce are encouraging, the Boolean rules obtained cannot furnish the user with a measure of confidence in the decisions made. With Fuzzy Logic [10], however, a set of fuzzy if-then rules and membership functions can be used to define the benign and malignant cases of breast cancer, and a fuzzy inference algorithm can be applied over these rules to diagnose it. Accuracy maximization and complexity minimization are the two main goals in the design of a fuzzy rule-based diagnostic system. In general, the rules and membership functions are formed from the experience of human experts.
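To illustrate how fuzzy if-then rules and membership functions can yield a graded decision rather than a Boolean one, the following is a minimal Python sketch. It is not the classifier developed in this paper: the two input names, the shapes and breakpoints of the "low" and "high" membership functions, the two rules, and the use of min/max for AND/OR are all hypothetical choices made only for illustration, and inputs are assumed to lie on a 1 to 10 scale.

```python
def low(x, full=1.0, zero=6.0):
    """Membership in 'low': 1 at or below `full`, falling linearly to 0 at `zero`.
    The breakpoints are illustrative assumptions, not values from the paper."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def high(x, zero=4.0, full=10.0):
    """Membership in 'high': 0 at or below `zero`, rising linearly to 1 at `full`.
    The breakpoints are illustrative assumptions, not values from the paper."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)


def classify(thickness, uniformity):
    """Evaluate two hypothetical rules and return (label, firing strengths).

    R1: IF thickness is low  AND uniformity is low  THEN benign
    R2: IF thickness is high OR  uniformity is high THEN malignant
    AND is modelled with min and OR with max (a common choice, assumed here).
    """
    strengths = {
        "benign": min(low(thickness), low(uniformity)),
        "malignant": max(high(thickness), high(uniformity)),
    }
    # The class whose rule fires most strongly wins; ties go to "benign".
    if strengths["benign"] >= strengths["malignant"]:
        label = "benign"
    else:
        label = "malignant"
    return label, strengths


if __name__ == "__main__":
    for sample in [(1, 1), (5, 5), (10, 10)]:
        print(sample, classify(*sample))
```

For an input such as (5, 5), both rules fire weakly (about 0.2 for benign and 0.17 for malignant), so the returned strengths act as the kind of confidence measure that a Boolean rule cannot provide.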
With an increasing number of variables, the possible number of rules increases exponentially, which makes it difficult for experts to define a complete rule set for good system performance. Data-driven approaches have been proposed for developing the fuzzy system from numerical data without domain experts [11, 12]. However, they are weak at self-learning and at determining the required number of fuzzy if-then rules. The design of a fuzzy classifier system can be formulated as a search problem in a high-dimensional space where each point represents a rule set, membership functions and the corresponding system behavior. Given some performance criteria, the performance of the system forms a hypersurface in this space. Developing the optimal fuzzy system is equivalent to finding the optimal location on this hypersurface. This makes evolutionary algorithms such as the genetic algorithm [13] a better candidate for fuzzy classifier design. Genetic algorithms are search algorithms based on the mechanics of natural genetics. In [14], a genetic algorithm based fuzzy classifier was proposed in which binary strings are used to represent the solution variables and basic genetic operators are applied. In [15], a fuzzy-genetic approach was proposed in which integer strings are used to represent the solution variables. This paper presents an improved genetic

Copyright © 2008
Paper Identification Number: JUL08-03
This peer-reviewed journal paper has been published by the Pentagram Research Centre (P) Limited. Responsibility of contents of this paper rests upon the authors and not upon Pentagram Research Centre (P) Limited. Copies can be obtained from the company for a cost.