VOL. 2, NO. 11, October 2011 ISSN 2079-8407
Journal of Emerging Trends in Computing and Information Sciences
©2009-2011 CIS Journal. All rights reserved.
http://www.cisjournal.org
A Study of Mining Software Engineering Data and
Software Testing
T. Murali Krishna¹, Devara Vasumathi²
1. Lecturer, Department of Computer Science, College of Engineering & Technology, Jimma University, Jimma, Ethiopia
2. Associate Professor, Department of Computer Science & Engineering, College of Engineering & Technology,
Jawaharlal Nehru Technological University, Hyderabad, India.
{murali2007tel@yahoo.com¹, vasukumar_devara@yahoo.co.in²}
ABSTRACT
The primary goal of software development is to deliver optimal software, i.e., software produced at low cost, with
high quality and productivity, and delivered on schedule. To achieve this, programmers generally reuse existing
libraries rather than developing similar code from scratch. While reusing libraries, programmers face several
challenges: many existing libraries are not properly documented, and many libraries expose their functionality
through a large number of program interfaces (PIs). These challenges lead to problems that hinder the production
of optimal software: reusing existing libraries consumes more time, programmers lack knowledge of how to reuse
program interfaces, and effective test inputs cannot be generated during white-box testing. The first two problems
reduce software productivity, whereas the last one affects software testing.
To resolve these problems, we propose a general framework called Netminer. Netminer contains a code search
engine, which is used to search the open source code available on the Internet. In the analysis phase, Netminer
automatically compares the specifications of program interfaces with relevant code examples available on the
Internet. In the next phase, Netminer applies data mining techniques to the collected code examples and
identifies common patterns. These common patterns represent the exact usage of program interfaces.
We propose several further approaches based on Netminer. Some help programmers reuse the program interfaces
provided by existing libraries effectively. Some identify defects in the code under analysis using the mined
specifications, and some help generate test inputs by means of static and dynamic test generation. Our study
shows that the Netminer framework can be used effectively in software engineering to achieve optimal software.
Keywords: Software Engineering, Data Mining, Program Interface, Netminer, Algorithms.
1. INTRODUCTION
What affects software productivity, and how do we
improve it? This concern is near and dear to those who
are responsible for researching and developing large
software systems. The main aim of software development
is to produce optimal software efficiently and effectively.
To attain optimal software, programmers reuse existing
libraries rather than developing similar code from
scratch. These libraries include open source libraries
such as those available for Eclipse or C#. Since 1995,
there has been rapid growth not only in open source
libraries but also in the reuse of these libraries.
Earlier research has observed that more than 40% of
the source files in the projects under analysis include
code from open source libraries. A new programming
methodology based on reusing libraries is called
Opportunistic Software Systems Development (OSSD).
Using OSSD, programmers develop systems by combining
ready-made components. Rather than developing similar
code from the beginning, OSSD reuses existing libraries.
Reusing existing libraries helps reduce effort during
software maintenance and also increases software
productivity.
For the reuse of libraries, we consider object-oriented
libraries, where inheritance plays a vital role. The
functionality of object-oriented libraries is exposed
through an interface called a Program Interface (PI). In
object-oriented languages, a PI represents a set of
classes and methods provided by a library. To reuse
existing libraries effectively, programmers need to
know how to use PIs. The following two