Discovering Dependencies in Medical Data by Visualisation

Jacek Dryl+, Halina Kwasnicka*, Urszula Markowska-Kaczmar*, Rafal Matkowski+, Paweł Mikołajczyk*, Jacek Tomasiak*
+ Medical University of Wrocław
* Wroclaw University of Technology, Wyb. Wyspianskiego 27, 50-270 Wrocław, Poland
Kaczmar@ci.pwr.wroc.pl

Abstract
This paper tests the visualization of medical data sets by Sammon's mapping. The mapping projects multidimensional data into a low-dimensional space; for the purpose of visualization, the target dimension is 2. The results were verified against those of a statistical approach. Sammon's mapping appears promising both for discovering new regularities in the data and for producing a prognosis of disease-free survival for new patients. In future development of the application, we plan to implement a neural network that will transform new data automatically and help to estimate missing values.

1. Introduction
Breast cancer and carcinoma of the cervix uteri are the most frequently diagnosed cancers among women in Poland. Despite advances in diagnosis and treatment, they are also the leading causes of cancer death. Observational studies show that the disease diagnosed as breast cancer includes at least two entities that are, as yet, not reliably distinguished: one with a rapidly fatal outcome and the other with an outcome only slightly different from that of a group of women of similar age without evidence of disease. If we could identify which tumors are particularly aggressive, the affected patients might want to consider more intensive therapies. This is why many experimental studies focus on developing new prognostic and predictive factors.
Owing to the interdisciplinary character of our research group, which consists of medical experts and computer scientists, we are better able to understand the problem and the benefits offered by heuristic and computational methods. In the research presented here, we investigate whether visualization can be helpful for this problem and, if so, whether it can be applied alone or only as a complementary tool for other methods. In previous studies, data collected at the Lower Silesian Oncology Center on patients with breast cancer were analyzed with the statistical methods offered by the Statistica package (StatSoft, Inc.) and the R package (Bell Laboratories) [9], [10]. Some of those results are presented here in order to compare the statistical approach and visualization on the same data. Some visualization experiments were also carried out on data from patients with carcinoma of the cervix uteri, because this group of patients was much larger.

This paper is organized as follows. Section 2 presents the two experimental data sets on which we evaluated Sammon's mapping as a visualization method. The method is described in detail in Section 4. Section 5 shows the experimental results and compares them with those obtained by the statistical approach. Future plans and conclusions are presented at the end of the paper.

2. The experimental data
In our study we used two different data files. The first comes from a 5-year observation of 527 patients with primary cancer of the cervix uteri treated at the Lower Silesian Cancer Center in 1996, 1997 and 1998 [10].
The clinical and pathological data available on these patients include: date of birth and patient's age, FIGO stage of the disease (according to FIGO Staging, 1994), tumor size, histological type of the tumor, degree of differentiation of the tumor, interval between diagnosis and first treatment (both dates), type of surgical treatment, type of performed radiotherapy, duration of radiotherapy, assessment of response to treatment, date of end of