FUOYE Journal of Engineering and Technology, Volume 3, Issue 2, September 2018 ISSN: 2579-0625 (Online), 2579-0617 (Paper)
FUOYEJET © 2018 50
http://dx.doi.org/10.46792/fuoyejet.v3i2.200 engineering.fuoye.edu.ng/journal
Software Defect Prediction Using Ensemble Learning: An ANP Based
Evaluation Method
*Abdullateef O. Balogun, Amos O. Bajeh, Victor A. Orie and Ayisat W. Yusuf-Asaju
Department of Computer Science, University of Ilorin, Nigeria
{balogun.ao1|bajehamos|yusuf.aaw}@unilorin.edu.ng|orievictor123@gmail.com
Abstract— Software defect prediction (SDP) is the process of predicting defects in software modules; it identifies the modules that are
defective and require extensive testing. Classification algorithms that help to predict software defects play a major role in the software
engineering process. Some studies have shown that ensembles are often more accurate than single classifiers. However,
results vary across studies, some of which posit that the effectiveness of learning algorithms may vary with the performance measure used. This
is because most studies on SDP prioritize the accuracy of the model or classifier over other performance metrics. This paper evaluated
the performance of single classifiers (SMO, MLP, kNN and Decision Tree) and ensembles (Bagging, Boosting, Stacking and Voting) in SDP
across major performance metrics, using the Analytic Network Process (ANP) multi-criteria decision method. The experiment was based on
11 performance metrics over 11 software defect datasets. Boosted SMO, Voting and Stacking ensemble methods ranked highest, with
priority levels of 0.0493, 0.0493 and 0.0445 respectively. Decision Tree ranked highest among the single classifiers, with 0.0410. These results
indicate that ensemble methods can give better classification results in SDP, with the Boosting method giving the best result. In essence,
all performance metrics should be considered before deciding which model or classifier is better for software defect prediction.
Keywords— Data mining, Machine Learning, Multi Criteria Decision Making, Software Defect Prediction
—————————— ◆ ——————————
1 INTRODUCTION
Software engineering is an engineering discipline that
is concerned with all aspects of producing software
from the early stages of software specification
through to maintaining the system after it has gone into
use (Lan, 2009). In any area of software engineering,
errors are largely unavoidable and can lead to
defects in software. Software defects are usually
discovered during the testing phase of the development
process (Hui, 2014). A software defect is an error or flaw
in a software program or system that causes the
production of an unwanted result. A software defect also
exists when the final software product does not
meet customer requirements or user expectations
(Aruna, Radhika, & Swathi, 2016). Defects can increase
the cost of software development and decrease the
overall quality of the software product. Over the years,
researchers have developed classification models for the
prediction of defects in software. Some studies showed
that ensemble methods perform better than single
classifiers in software defect prediction (Yi, Gang,
Guoxun, Wenshuai, & Yong, 2011; Lessman, Baesans,
Meus, & Pietsch, 2008), while some other works
indicated that single classifiers perform better (Bowes,
Hall & Petrić, 2017; Aleem, Capretz & Ahmed, 2015). This
study aims to evaluate the performance of
ensemble methods and single classifiers using the Analytic
Network Process (ANP), a multi-criteria
decision-making technique.
The rest of this paper is organized as follows: Section 2
presents a review of related works. Section 3 discusses
the theoretical background of the study, covering
the classifiers, feature selection method, ensemble
methods and ANP. Section 4 presents the research
method used in the experiment and analyzes the results.
Section 5 presents results and discussion. Section 6
concludes the paper and presents some
recommendations based on the results of the study.
* Corresponding Author
2 RELATED WORKS
A lot of work has been carried out on software defect
prediction; this section highlights research work
involving defect prediction, feature selection, ensemble
and multi-criteria decision-making (MCDM). Aleem,
Capretz & Ahmed (2015) covered different
machine learning methods that can be used for defect
prediction and analyzed the performance of different
algorithms on various software datasets. SVM and MLP
techniques performed well on bug datasets. To
select the appropriate method for bug prediction,
domain experts have to consider various factors such as
the type of datasets, problem domain, uncertainty in
datasets or the nature of the project.
Feature selection has also been applied by researchers to
software defect prediction. Ghotra, McIntosh, & Hassan
(2017) studied 30 feature selection techniques and 21
classification techniques when applied to 18 datasets
from the NASA and PROMISE corpora. Their results
showed that a correlation-based filter-subset feature
selection technique with a BestFirst search method
outperforms other feature selection techniques across the
studied datasets and across the studied classification
techniques. They recommended the application of such a
selection technique when building defect classification
models.
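To illustrate why a correlation-based filter tends to prefer subsets whose features correlate with the class but not with one another, the sketch below evaluates the subset merit heuristic commonly associated with correlation-based feature selection, merit = k * r_cf / sqrt(k + k(k-1) * r_ff), where k is the subset size, r_cf the mean feature-class correlation and r_ff the mean feature-feature correlation. This is a minimal sketch under stated assumptions: the function name cfs_merit and all correlation values are hypothetical and are not taken from Ghotra et al., and the BestFirst search that would explore candidate subsets is not shown.

```python
import math


def cfs_merit(feature_class_corr, feature_feature_corr):
    """Merit of a feature subset (hypothetical helper, illustration only).

    feature_class_corr: absolute correlation of each feature with the class.
    feature_feature_corr: absolute correlation of each unordered feature pair.
    """
    k = len(feature_class_corr)
    if k == 0:
        return 0.0
    expected_pairs = k * (k - 1) // 2
    if len(feature_feature_corr) != expected_pairs:
        raise ValueError("expected one correlation per unordered feature pair")
    # Mean feature-class correlation (relevance of the subset).
    r_cf = sum(feature_class_corr) / k
    # Mean feature-feature correlation (redundancy within the subset).
    r_ff = sum(feature_feature_corr) / expected_pairs if expected_pairs else 0.0
    return (k * r_cf) / math.sqrt(k + k * (k - 1) * r_ff)


# Made-up correlations: one feature alone, then the same feature paired
# with a redundant partner and with a largely independent partner.
single = cfs_merit([0.6], [])
redundant_pair = cfs_merit([0.6, 0.5], [0.9])
complementary_pair = cfs_merit([0.6, 0.5], [0.1])
print(single, redundant_pair, complementary_pair)
```

With these made-up values, adding a highly redundant second feature lowers the merit below that of the single feature, while adding a weakly correlated one raises it.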
Issam, Mohammad, & Lahouari (2014) examined the
effect of combining feature selection and ensemble
learning on the performance of defect classification.
They combined selected ensemble learning models with
efficient feature selection on the datasets, based on defect
classification performance measures. The results of their
study showed that features of a software defect dataset
must be carefully selected for precise classification of
defective modules.
In another study, Yi et al. (2010) incorporated a set of
MCDM methods to rank classification algorithms. The
study used four MCDM methods to rank 38
classification algorithms based on 13 evaluation criteria