VOL. 2, NO. 11, October 2011 ISSN 2079-8407
Journal of Emerging Trends in Computing and Information Sciences
©2009-2011 CIS Journal. All rights reserved.
http://www.cisjournal.org

A Study of Mining Software Engineering Data and Software Testing

T. Murali Krishna 1, Devara Vasumathi 2
1. Lecturer, Department of Computer Science, College of Engineering & Technology, Jimma University, Jimma, Ethiopia
2. Associate Professor, Department of Computer Science & Engineering, College of Engineering & Technology, Jawaharlal Nehru Technological University, Hyderabad, India
{murali2007tel@yahoo.com 1, vasukumar_devara@yahoo.co.in 2}

ABSTRACT
The primary goal of software development is to deliver optimal software, i.e., software produced at low cost, with high quality and productivity, and delivered on schedule. To achieve this, programmers generally reuse existing libraries rather than developing similar code from scratch. While reusing libraries, programmers face several challenges: many existing libraries are not properly documented, and many contain a large number of program interfaces (PIs) through which they expose their functionality. These challenges lead to problems that hinder the production of optimal software: reusing existing libraries consumes more time, programmers lack knowledge of how to reuse program interfaces, and effective test inputs cannot be generated during white box testing. The first two problems reduce software productivity, whereas the last one affects software testing. To resolve these problems, we propose a general framework called Netminer. Netminer contains a code search engine, with which we can search the open source code available on the internet. In the analysis phase, Netminer automatically compares the specifications of program interfaces with relevant code examples available on the internet.
In the next phase, Netminer applies data mining techniques to the collected code examples and identifies common patterns. These common patterns represent the exact usage of program interfaces. We propose several further approaches based on Netminer. Some help programmers effectively reuse the program interfaces provided by existing libraries. Some detect defects in the code under analysis using the mined specifications, and some help generate test inputs through static and dynamic test generation. Our research study shows that the Netminer framework can be used effectively in software engineering to achieve optimal software.

Keywords: Software Engineering, Data Mining, Program Interface, Netminer, Algorithms.

1. INTRODUCTION
What affects software productivity, and how do we improve it? This is a concern near and dear to those who are responsible for researching and developing large software systems. The main aim of software development is to produce optimal software efficiently and effectively. To attain optimal software, programmers reuse existing libraries rather than developing similar code from scratch. These libraries include open source libraries such as Eclipse or C#. Since 1995, there has been rapid growth not only in open source libraries but also in their reuse. Earlier research has observed that more than 40% of the source files in the projects under analysis include code from open source libraries.

A new programming methodology based on reusing libraries is called Opportunistic Software Systems Development (OSSD). Using OSSD, programmers develop systems by combining ready-made components. Rather than developing similar code from the beginning, OSSD reuses existing libraries. Reuse of existing libraries helps reduce effort during software maintenance and also increases software productivity.
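
As a rough illustration of the pattern-mining idea outlined in the abstract, the following Python sketch counts which ordered pairs of calls recur across a set of collected code examples. It is not Netminer's actual algorithm, which this section does not specify. The function name `mine_frequent_call_pairs`, the `min_support` threshold, and the call labels in the sample data are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_frequent_call_pairs(examples, min_support):
    """Return ordered call pairs (a, b), with a occurring before b,
    that appear in at least `min_support` of the given examples.

    `examples` is a list of call sequences, one per code example.
    This is a simplified stand-in for a real pattern-mining step.
    """
    pair_counts = Counter()
    for calls in examples:
        # Collect each ordered pair at most once per example,
        # so the count is the number of supporting examples.
        seen_pairs = set()
        for i, j in combinations(range(len(calls)), 2):
            if calls[i] != calls[j]:
                seen_pairs.add((calls[i], calls[j]))
        pair_counts.update(seen_pairs)
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Hypothetical call sequences, as if extracted from three code examples.
examples = [
    ["FileReader.new", "BufferedReader.new",
     "BufferedReader.readLine", "BufferedReader.close"],
    ["FileReader.new", "BufferedReader.new", "BufferedReader.close"],
    ["BufferedReader.new", "BufferedReader.readLine", "BufferedReader.close"],
]

patterns = mine_frequent_call_pairs(examples, min_support=3)
for (first, second), support in patterns.items():
    print(f"{first} -> {second} (seen in {support} examples)")
```

A pair that survives a high support threshold can be read as a candidate usage rule (for instance, that one call is normally followed by another), which is the kind of mined specification that could later be checked against the code under analysis.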
For the reuse of libraries, we consider object-oriented libraries, where inheritance plays a vital role. The functionality of object-oriented libraries is exposed through an interface called a Program Interface (PI). In object-oriented languages, a PI represents a set of classes and methods provided by a library. To reuse existing libraries effectively, programmers need to know how to use PIs. The following two