Contributions of KDD to the Knowledge Management Process: a Case Study

Paulo de Tarso Costa de Sousa
Superior Electoral Court
SEPN 514 Ed Anexo II
55-61-340-3863
ptarso@tse.gov.br

Hércules Antonio do Prado
Brazilian Agricultural Research Corporation
Center for Agricultural Research on Savannah, Caixa Postal 08.223 – CEP: 70770-000 - Planaltina, DF, Brasil
hercules@cpac.embrapa.br

Eduardo Amadeu Moresi
Catholic University of Brasília
Graduate Program in Management of Knowledge and Information Technology
CEP: 70.790-160 - Brasília, DF, Brasil
moresi@ucb.br

Marcelo Ladeira
University of Brasília
Campus Universitário Darcy Ribeiro - Asa Norte
CEP: 70910–900 – Brasília – DF - Brasil
mladeira@cic.unb.br

ABSTRACT
Knowledge Discovery in Databases (KDD), like any organizational process, is carried out under a Knowledge Management (KM) model adopted (even informally) by a corporation. KDD is roughly described in three steps: pre-processing, data mining, and post-processing. The last of these is mainly concerned with transforming the patterns produced in the data mining step into knowledge. KM, in turn, comprises the following phases, in which knowledge is the subject of the actions: identification of abilities, acquisition, selection and validation, organization and storage, sharing, application, and creation. Although there are many overlaps between KDD and KM, one of them is broadly recognized: the point at which knowledge arises. This paper reports a study aimed at clarifying the relations between the overlapping areas of KDD and knowledge creation in KM. The work is conducted by means of a case study using data from the Electoral Court of the Federal District (ECFD), Brazil. The study was developed over a data set of 1,717,000 citizens, from which data mining models were built by applying algorithms from Weka.
It was observed that, although the importance of Information Technology is well recognized in the KM realm, the techniques of KDD deserve a special place in the knowledge creation phase of KM. Moreover, beyond the overlap of post-processing and knowledge creation, other steps of KDD can contribute significantly to KM. For example, one important decision of the ECFD board was made on the basis of knowledge acquired in the pre-processing step of KDD.

Keywords
data mining, knowledge management, knowledge discovery in databases.

1. INTRODUCTION AND MOTIVATION
Practitioners of Knowledge Management (KM) usually adopt some model to guide the process, and this model describes the activity of knowledge creation. On the other hand, the practice of Knowledge Discovery in Databases (KDD) has been supported by methods like CRISP-DM [1], which include an interpretation activity. The clearest overlap between KDD and KM lies between the knowledge creation phase in KM and the interpretation task in KDD. In this paper we explore relations between KDD and KM phases by means of a case study, in order to show the specific importance of KDD to KM and to smooth the road toward the integration of both areas. It is shown how KDD can generate relevant input to KM not only from the data mining and post-processing tasks, but also from the pre-processing activities. Observations made during tasks like problem definition, data acquisition and cleaning, and data and algorithm engineering can make the analyst aware of organizational failures. These failures, if not properly analyzed in a KM context, may not receive the attention they deserve, keeping the organization from benefiting from preventive or corrective actions. The discussion is based on an application in the Brazilian election domain, with a focus on organizational data rather than on voting problems.
The CRISP-DM [1] method was applied to conduct the process, while the Weka [17] suite was adopted to build classification and clustering models. Specifically, a k-means algorithm was used for clustering and decision trees for classification. The reference model we used for KM is the generic one proposed by Stollenwerk [14], shown in Figure 1.

Figure 1 – Generic KM model of Stollenwerk

This model considers four aspects that compose the environment of KM: leadership, culture, technology, and measures and compensations. Inside this environment, seven activities related to the generation of organizational knowledge are proposed. They
CLEI ELECTRONIC JOURNAL, VOLUME 7, NUMBER 1, PAPER 2, JUNE 2004