Permutation k-sample Goodness-of-Fit Test
for Fuzzy Data
Przemyslaw Grzegorzewski
Faculty of Mathematics and Information Science, Warsaw University of Technology,
Koszykowa 75, 00-662 Warsaw, Poland;
and Systems Research Institute, Polish Academy of Sciences,
Newelska 6, 01-447 Warsaw, Poland
pgrzeg@mini.pw.edu.pl
Abstract—The problem of testing goodness-of-fit for k distri-
butions based on fuzzy data is considered. A new permutation
test for fuzzy random variables is proposed. Besides the general
construction of the test, an algorithm ready for practical use
is delivered. A case study illustrating the applicability of the
suggested testing procedure is also presented.
Index Terms—fuzzy data, fuzzy number, fuzzy random vari-
able, goodness-of-fit test, permutation test, random fuzzy number,
trapezoidal fuzzy number
I. INTRODUCTION
Most statistical procedures are constructed under fairly
specific assumptions regarding the underlying population
distribution. In particular, one of the most often used techniques
for comparing several treatments, i.e., analysis of variance
(ANOVA), assumes not only independence of observations
and normality of all populations but also the
homogeneity of their variances. Obviously, such strong
assumptions are quite often not satisfied. Unfortunately, ANOVA, like
some other statistical tests, is sensitive to violations of the
fundamental model assumptions inherent in its derivation. In
such cases distribution-free methods, also called nonparametric,
are very useful. In particular, the Kruskal-Wallis test can
be used for comparing several independent samples. This test,
unlike ANOVA, requires neither normality nor homogeneity
of variances, which is why it is sometimes called the
nonparametric analogue of one-way analysis of variance.
Real-life data sets often consist of imprecise or vague
observations. In particular, many human ratings based on
opinions or associated with perceptions lead to data that
cannot be expressed on a numerical scale. Such data consist
of intrinsically imprecise or fuzzy elements. Thus, if they also
appear as a realization of some random experiment, we are
faced with fuzzy random variables that cannot be analyzed
by classical statistical methods and require a different, adequate
approach.
Several approaches have been developed in the literature for
testing statistical hypotheses with fuzzy data. Depending on
the context and whether data are perceived from the epistemic
or the ontic view (see [4]), various test constructions have been
proposed (for an overview we refer the reader, e.g., to
[8], [9], [12]–[15], [17], [22], [26]–[28], [32], [33]).
In particular, the problem of testing the equality of k samples
against the so-called “simple-tree alternative” or “many-one
problem” for fuzzy data based on the necessity index of strict
dominance is considered in [17]. A bootstrap test for the
equality of fuzzy means of k populations can be found in [9],
while [33] contains a bootstrap procedure for testing the
homoscedasticity of k populations.
One may wonder why the nonparametric Kruskal-Wallis
test has not been generalized for fuzzy data. The reason is
that the Kruskal-Wallis test is based on ranks which cannot
be determined for fuzzy samples since fuzzy numbers are not
linearly ordered. However, the Kruskal-Wallis test is actually
a k-sample goodness-of-fit test, since its null hypothesis
states that all k samples under study come from the
same distribution (which is equivalent to the statement that all
k populations are identically distributed). Therefore, one may
consider another construction of the k-sample goodness-of-fit
test, which does not need any ranks. Such a construction, based
on permutations, is proposed in this contribution.
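To make the idea of a rank-free permutation construction concrete, the following minimal sketch shows a generic k-sample permutation test for ordinary real-valued (crisp) data. It is only an illustration under stated assumptions and is not the test proposed in this paper: the function names, the default number of permutations, and in particular the test statistic (a between-group sum of squares, which reacts mainly to differences in means) are hypothetical choices made for this example.

```python
import random
from statistics import fmean


def between_group_statistic(samples):
    """Weighted sum of squared deviations of group means from the grand mean.

    Assumption for illustration only: this is not the statistic of the paper.
    """
    pooled = [x for sample in samples for x in sample]
    grand_mean = fmean(pooled)
    return sum(len(s) * (fmean(s) - grand_mean) ** 2 for s in samples)


def permutation_k_sample_test(samples, n_permutations=2000, seed=0):
    """Approximate permutation p-value for H0: all k samples share one distribution."""
    rng = random.Random(seed)
    sizes = [len(s) for s in samples]
    pooled = [x for sample in samples for x in sample]
    observed = between_group_statistic(samples)

    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        # Split the shuffled pooled data into groups of the original sizes.
        regrouped, start = [], 0
        for n in sizes:
            regrouped.append(pooled[start:start + n])
            start += n
        if between_group_statistic(regrouped) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (exceed + 1) / (n_permutations + 1)


if __name__ == "__main__":
    groups = [[0, 1, 2, 3, 4], [10, 11, 12, 13, 14], [20, 21, 22, 23, 24]]
    print(permutation_k_sample_test(groups))
```

No ranks are needed here: under the null hypothesis the pooled observations are exchangeable, so only a statistic that can be recomputed after each relabelling is required. For fuzzy data the statistic would have to be built from operations that are well defined for fuzzy numbers.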
The paper is organized as follows: in Sec. II we introduce
the notation and recall basic concepts related to fuzzy data
modeling and operations on fuzzy numbers. Sec. III is devoted
to fuzzy random variables. In Sec. IV we propose the general
idea of the k-sample goodness-of-fit permutation test for fuzzy
data. Besides the test construction, we deliver a testing algorithm
ready for practical use. Next, in Sec. V we adapt the general
construction of the suggested test to trapezoidal fuzzy
numbers. Then we present some results of the simulation study
(Sec. VI) and the case study (Sec. VII) with the proposed
test. Finally, conclusions and some indications for further
research are given in Sec. VIII.
II. FUZZY DATA
A fuzzy number is an imprecise value characterized by a
mapping A : R → [0, 1] (called a membership function) such
that its α-cut defined by
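\[
A_\alpha =
\begin{cases}
\{x \in \mathbb{R} : A(x) \geq \alpha\} & \text{if } \alpha \in (0, 1], \\
\mathrm{cl}\{x \in \mathbb{R} : A(x) > 0\} & \text{if } \alpha = 0,
\end{cases}
\tag{1}
\]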
is a nonempty compact interval for each α ∈ [0, 1]. Operator
cl in (1) stands for the closure. Thus every fuzzy number is
completely characterized both by its membership function
A(x) and by the family of its α-cuts $\{A_\alpha\}_{\alpha \in [0,1]}$. Two α-cuts
978-1-7281-6932-3/20/$31.00 ©2020 IEEE