Mining Healthcare Data: The Case of an Endoscopic Thoracic Sympathectomy Dataset

MARIBEL YASMINA SANTOS, DIANA GONÇALVES
Algoritmi Research Centre
University of Minho, Guimarães, PORTUGAL
maribel@dsi.uminho.pt, pg13247@uminho.pt

JORGE CRUZ
Medicine Faculty, University of Lisbon, Lisboa, PORTUGAL
costacruzjorge@gmail.com

Abstract: - The process of knowledge discovery in databases aims at the discovery of associations within the data of a dataset. Data Mining is a central step of this process, corresponding to the application of algorithms for identifying patterns in data. This paper presents the analysis of a dataset containing data on 227 patients who underwent endoscopic thoracic sympathectomy, a treatment for primary palmar hyperhidrosis. Primary hyperhidrosis is characterized by excessive sweating caused by a disorder of the sympathetic autonomic nervous system. The results show an overall improvement in the patients' quality of life, mainly associated with their emotional state.

Key-Words: - Knowledge discovery in databases, data mining, decision trees, primary hyperhidrosis, endoscopic thoracic sympathectomy

1 Introduction

Primary hyperhidrosis is a disorder of the sympathetic autonomic nervous system that affects around 1% of the global population [1]. It is characterized by excessive sweating of the face, palms, armpits and feet, either at all of these sites or at some of them. This excessive sweating causes several problems for the individual, whose life is completely shaped by this disorder.

Endoscopic upper-thoracic sympathectomy has been considered the treatment of choice for primary palmar hyperhidrosis [2]. This surgery is a minimally invasive procedure for thoracic sympathetic blockade and consists of the bilateral ablation of the second and third thoracic sympathetic ganglia, affecting the sympathetic nervous outflow to the arms and elsewhere [3-4].
As a definitive treatment for this disorder, the surgery has yielded a high degree of patient satisfaction [2]. To assess the improvement in quality of life and the incidence of complications and side effects after the surgery, this paper analyzes data collected from 227 patients. Although several studies have evaluated the improvement in the patients' health condition [1-6], the analysis presented in this paper takes a new perspective, using data mining algorithms to analyze the available data.

The analysis of the collected data showed an improvement in the overall health condition of the patients, with a significant change in their emotional state. The incidence of compensatory hyperhidrosis was also analyzed, as it is one of the major complaints of patients after the surgery. These results are presented later in this paper.

This paper is organized as follows. Section 2 presents an overview of the knowledge discovery process and its main steps. Section 3 gives an overview of the collected data and the questionnaire used in the data collection process. Section 4 presents the results achieved using the Clementine Data Mining System for data analysis. Section 5 concludes with some remarks about the work undertaken.

RECENT ADVANCES in ARTIFICIAL INTELLIGENCE, KNOWLEDGE ENGINEERING and DATA BASES
ISSN: 1790-5109 333 ISBN: 978-960-474-154-0

2 Knowledge Discovery in Databases

Knowledge Discovery in Databases (KDD) is a process that aims at the discovery of associations within datasets. Data Mining is a central step of this process. It corresponds to the application of algorithms for identifying patterns from data without the additional steps of the knowledge discovery