410 Int. J. Computing Science and Mathematics, Vol. 7, No. 5, 2016
Copyright © 2016 Inderscience Enterprises Ltd.
Significance of non-parametric statistical tests for
comparison of classifiers over multiple datasets
Pawan Kumar Singh*, Ram Sarkar and
Mita Nasipuri
Department of Computer Science and Engineering,
Jadavpur University,
188, Raja S.C. Mullick Road,
Kolkata-700032, West Bengal, India
E-mail: pawansingh.ju@gmail.com
E-mail: raamsarkar@gmail.com
E-mail: mitanasipuri@gmail.com
*Corresponding author
Abstract: In machine learning, developing new algorithms or, in most cases,
making minor amendments to existing ones is a common task. In such cases, a
rigorous and correct statistical analysis of the results of the different algorithms is
necessary in order to select the most suitable technique(s) for the problem to
be solved. The main difficulty in meeting this need is the absence of a
proper compilation of statistical techniques. In this paper, we propose the use of
two important non-parametric statistical tests, namely, the Wilcoxon signed rank
test for comparison of two classifiers and the Friedman test with the corresponding
post-hoc tests for comparison of multiple classifiers over multiple datasets. We
also introduce a new variant of non-parametric test, known as Scheffe’s test, for
locating unequal pairs of mean performances of multiple classifiers when
the given datasets are of unequal sizes. The parametric tests, which were
previously used for comparing multiple classifiers, are also
described in brief. The proposed non-parametric tests are also applied to
the classification results on ten real-problem datasets taken from the
UCI Machine Learning Database Repository (http://www.ics.uci.edu/mlearn)
(Valdovinos and Sanchez, 2009) as case studies.
Keywords: statistical comparison; non-parametric test; Scheffe’s test;
Wilcoxon signed rank test; Friedman test; post-hoc test.
Reference to this paper should be made as follows: Singh, P.K., Sarkar, R.
and Nasipuri, M. (2016) ‘Significance of non-parametric statistical tests for
comparison of classifiers over multiple datasets’, Int. J. Computing Science and
Mathematics, Vol. 7, No. 5, pp.410–442.
Biographical notes: Pawan Kumar Singh received his BTech in Information
Technology from West Bengal University of Technology in 2010. He received
his MTech degree from Jadavpur University (JU) in 2013. He is currently
pursuing his PhD degree at JU. His current research interests include pattern
recognition, handwritten document analysis, image processing, bioinformatics
and artificial intelligence.