IJCSNS International Journal of Computer Science and Network Security, VOL.9 No.10, October 2009 153

Manuscript received October 5, 2009
Manuscript revised October 20, 2009

A Model for the Management of the Knowledge applied to the Software Development Process

Vieira, Sandro C.
Banco do Brasil, Department of Information Technology, Ed. Sede IV – Brasilia / DF

Weigang, Li
University of Brasilia - UnB, Department of Computer Science, Darcy Ribeiro Campus

Ladeira, Marcelo
University of Brasilia - UnB, Department of Computer Science, Darcy Ribeiro Campus

Abstract
This paper describes research conducted to obtain a model, based on Text Mining and Case Based Reasoning, designed to aid knowledge management in the software development process and to improve software reuse. To achieve this goal, we propose a model to determine the similarity between software development requests, which are composed of both structured elements, described by codes or symbols, and non-structured elements, described in natural language. An algorithm for measuring this similarity is proposed, and the model was implemented as a complete information system. The application was carried out as a case study in a financial organization. The results collected illustrate the efficiency of the system in improving software reuse and in avoiding duplicated effort in solution design. The paper shows the potential of the proposed system in the construction of detailed solutions and characteristics for this organization.

Categories and Subject Descriptors
D.2.0 [Software engineering]: General

General Terms
Algorithms, Management, Measurement, Documentation.

Keywords
Text mining, Case based reasoning, knowledge management, software development process, similarity measurement.

1. Introduction

Software development, although not the core business of most enterprises, is so closely linked to business that a significant number of organizations dedicate part of their resources to technology-related activities. This situation results from the need to develop products with ever shorter life cycles, in addition to the requirement for strict controls aimed at maintaining competitiveness against rival companies. Major companies have experienced this reality for many years, with the result that a considerable part of the applications currently in use originates from projects developed a long time ago. The situation is further aggravated by the lack of documentation for the applications in use, owing to the large number of interventions (maintenance) carried out in emergency situations, and by the use of languages that do not allow the developer to control and manage the evolution of the applications. A significant part of the main systems used by these organizations suffers from this problem; they are usually legacy systems, with a high maintenance rate, inadequate or non-existent documentation, and a frequent need to interact with other systems built on different concepts. A large part of these systems is written in structured languages and is executed every day on mainframes installed in data processing centers of immense capacity. They run in routines normally called batch, because they process data in lots. It is also common for these systems to have interfaces based on the standard established by the 3270 terminals (with evolutions, such as the use of colors) but restricted to presenting text in an area of 25 lines by 80 columns.
During the 1990s, many computer science specialists said that mainframes would quickly be replaced by smaller computers, whose processing capacity has been increasing continuously since they were launched. However, while processing capacity has been increasing day by day, so has the demand for processing; the volume of transactions processed each day has increased in the same proportion, notably in companies in the financial sector, especially banks and credit card managers, and consequently mainframes are still the choice of a great part of these institutions. This phenomenon is explained by the conviction of many IT specialists and executives that the performance of these systems, when executed on mainframes, has not been matched on other platforms. Besides, the heavy investments that would have to be made to migrate the billions of lines of code currently estimated to be in production, and the risk associated with the replacement of