Mining Plausible Hypotheses from the Literature
via Meta-Analysis
Jooyong Yi
Ulsan National Institute of Science and Technology
South Korea
jooyongyi@acm.org
Vladimir Ivanov Giancarlo Succi
Innopolis University
Russia
{v.ivanov, g.succi}@innopolis.ru
Abstract—Meta-analysis is highly advocated in many fields
of empirical research such as medicine and psychology, due
to its capability to synthesize quantitative evidence of effects
from the literature, based on statistical analysis. However, the
adoption of meta-analysis in software engineering still suffers
from inertia, even though many software engineering
researchers have long argued the need for it. In an attempt
to move beyond this impasse, we explore in this paper a different
use of meta-analysis. Our proposition is that meta-analysis is
useful for mining hypotheses because their plausibility is backed
by evidence accumulated in the literature, and thus researchers
could focus their effort on the areas that are of particular need.
We assess our proposition by conducting a lightweight case study
on the literature of defect prediction. We found that three out of
five hypotheses we extracted from our meta-analysis were indeed
investigated in separate papers, indicating the usefulness of our
approach. We also recognize two uninvestigated hypotheses whose
validity we plan to investigate in the future.
I. INTRODUCTION
In this paper, we provide a new perspective on an existing
idea we believe worthy of much more attention. The existing
idea we advocate here is meta-analysis. Meta-analysis refers
to the statistical analysis and synthesis of results from a series
of primary studies, i.e., individual studies under investigation
[3], [10]. A common use of meta-analysis is to synthesize
consolidated evidence from a body of primary studies often
containing conflicting results.
Meta-analysis has been advocated and widely used in
many fields of empirical research, such as medicine [10],
psychology [1], education [5], and business [13], enabling
evidence-based practice and decision-making. In software
engineering, despite the community-wide attention to the Empirical
and Evidence-Based Software Engineering movement initiated
by Victor Basili, Barry Boehm, and Dieter Rombach in the
1980s [2], the generalization of findings across experiments in
general, and the adoption of meta-analysis specifically, remain
limited to a fraction of papers such as [22], [25].
Further adoption of meta-analysis in software engineering
research has been called for by many researchers, for example
in [6], [14], [20], [24], to name a few. Our work is in line
with those pieces of work, but also moves beyond them by
advocating a new use of meta-analysis: mining hypotheses
whose plausibility is backed by evidence accumulated in the
literature. Considering that setting up a proper hypothesis is
a critical first step of empirical research, and that performing a
hypothesis test typically takes a substantial amount of
researchers' time, it is our proposition that using meta-analysis to
mine hypotheses would significantly advance the
discipline, since researchers could focus their effort on the
areas that are of particular need.
How can hypotheses be mined from the literature via meta-
analysis? As will be shown with a concrete example in this
paper, we make use of two standard mechanisms of meta-
analysis: a forest plot and subgroup analysis (an example is
shown in Fig. 1). With a forest plot, one can see whether or not
similar results are observed across multiple studies in a consis-
tent manner. If that is the case, the observed consistency can
form a plausible hypothesis. Otherwise, subgroup analysis can
be performed to see which factors contribute to inconsistent
observations, and a hypothesis can be made accordingly for
each of those factors.
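The two mechanisms above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in this paper: it assumes a fixed-effect model with inverse-variance weights, uses Cochran's Q and the I² statistic as the consistency check, and runs on made-up effect sizes. The function names (summarize, by_subgroup) and the subgroup labels "A" and "B" are hypothetical.

```python
import math
from collections import defaultdict


def summarize(studies):
    """Pool (effect, standard_error) pairs with inverse-variance weights.

    Assumes a fixed-effect model; a random-effects model would add a
    between-study variance term to each weight.
    """
    weights = [1.0 / se ** 2 for _, se in studies]
    total = sum(weights)
    mean = sum(w * e for w, (e, _) in zip(weights, studies)) / total
    se = math.sqrt(1.0 / total)
    # Cochran's Q: weighted squared deviations from the pooled mean.
    q = sum(w * (e - mean) ** 2 for w, (e, _) in zip(weights, studies))
    df = len(studies) - 1
    # I^2: percentage of variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "mean": mean,
        "ci": (mean - 1.96 * se, mean + 1.96 * se),  # 95% confidence interval
        "q": q,
        "i2": i2,
    }


def by_subgroup(rows):
    """Summarize (label, effect, standard_error) rows separately per label."""
    groups = defaultdict(list)
    for label, effect, se in rows:
        groups[label].append((effect, se))
    return {label: summarize(group) for label, group in groups.items()}


# Hypothetical data: (subgroup label, effect size, standard error).
rows = [
    ("A", 0.80, 0.10),
    ("A", 0.75, 0.12),
    ("B", 0.20, 0.10),
    ("B", 0.25, 0.15),
]

overall = summarize([(effect, se) for _, effect, se in rows])
groups = by_subgroup(rows)

print("overall:", overall)
for label, summary in sorted(groups.items()):
    print(label, summary)
```

In this made-up data set, the pooled studies disagree strongly (high I²), while each subgroup is internally consistent. In that situation the grouping factor itself becomes a candidate hypothesis: the effect may depend on whatever distinguishes subgroup A from subgroup B.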
In other fields such as medicine, meta-analysis has already
been used in suggesting a new hypothesis [7]. Our key contribution
in this paper is to provide an early assessment of using
meta-analysis in generating evidence-backed hypotheses in the
context of software engineering. We assess our proposition by
(1) extracting five hypotheses from the literature on defect
prediction via meta-analysis, and (2) subsequently checking
whether those hypotheses were indeed tested in separate
papers in the literature. The rationale of the second step
is as follows. The fact that a hypothesis was tested in the
literature indicates its worthiness for consideration. We found
that two hypotheses were indeed studied in recent papers,
one hypothesis was partially investigated, and two hypotheses
remain to be studied.
II. BACKGROUND
A common way to perform a meta-analysis is to construct
a forest plot such as Fig. 1. Information that can be shown
with a forest plot includes the following.
• Individual Studies: In Fig. 1, each square represents the
mean effect size of the corresponding study, with its size
proportional to its weight reflecting the precision of the
study. Meanwhile, each horizontal line crossing a
square represents the 95% confidence interval of the
corresponding study.
• Summary Effect: In Fig. 1, the diamond at the bottom
represents the summary effect, that is, the weighted mean of
2019 IEEE/ACM 41st International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-
NIER)
978-1-7281-1758-4/19/$31.00 ©2019 IEEE
DOI 10.1109/ICSE-NIER.2019.00017