Computational Biology and Chemistry
Research Article
Intelligent system based on data mining techniques for prediction of preterm
birth for women with cervical cerclage
Hasan Rawashdeh (a), Shatha Awawdeh (b), Fatima Shannag (b), Esraa Henawi (b), Hossam Faris (b,*),
Nadim Obeid (b), Jon Hyett (c)
(a) Department of Obstetrics and Gynaecology, Jordan University of Science and Technology, Jordan
(b) King Abdullah II School for Information Technology, The University of Jordan, Amman, Jordan
(c) Discipline of Obstetrics, Gynaecology and Neonatology, University of Sydney, Sydney, Australia
ARTICLE INFO
Keywords: Preterm birth; Prediction system; Cerclage; Data mining
ABSTRACT
Preterm birth, defined as delivery before 37 weeks’ gestation, continues to affect 8–15% of all pregnancies and
is associated with significant neonatal morbidity and mortality. Effective prediction of the timing of delivery among
women identified to be at significant risk of preterm birth would allow proper implementation of prophylactic
therapeutic interventions. This paper first aims to develop a model that acts as a decision support system for
pregnant women at high risk of delivering prematurely, before they have cervical cerclage. The model predicts
whether the pregnancy will continue beyond 26 weeks’ gestation and the potential value of adding the cerclage
in prolonging the pregnancy. The second aim is to develop a model that predicts the timing of spontaneous
delivery in this high-risk cohort after cerclage. The model will help treating physicians to plan the timing
of management in relation to the risk of preterm birth, reducing the associated neonatal complications.
Data from 274 pregnancies managed with cervical cerclage were included; 29 of the procedures involved
multiple pregnancies. To build the first model, a data balancing technique called SMOTE was applied to
overcome the problem of highly imbalanced class distribution in the dataset. Four classification models,
namely Decision Tree, Random Forest, K-Nearest Neighbors (K-NN), and Neural Network (NN), were then used to
build the prediction model. The Random Forest classifier gave the best results in terms of G-mean and
sensitivity, with values of 0.96 and 1.00, respectively, achieved at an oversampling ratio of 200%.
For the second prediction model, five models were used to predict the time of spontaneous delivery:
linear regression, Gaussian process, Random Forest, K-star, and locally weighted learning (LWL). The
Random Forest model performed best, with a correlation coefficient of 0.752. In conclusion, computational
models can be developed to predict the need for cerclage and the gestation of delivery after this procedure.
These models have moderate/high sensitivity for clinical application.
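The two quantities the abstract leans on, G-mean and the SMOTE oversampling ratio, can be illustrated with a minimal sketch. It assumes the common definition of G-mean as the geometric mean of sensitivity and specificity, and it assumes that a 200% ratio means two synthetic samples per minority-class sample; neither assumption is stated in this section. The function names `g_mean` and `smote_like` are hypothetical, and the code is a simplified illustration of SMOTE-style interpolation, not the authors' implementation.

```python
import math
import random


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    sensitivity = tp / (tp + fn)  # true-positive rate
    specificity = tn / (tn + fp)  # true-negative rate
    return math.sqrt(sensitivity * specificity)


def smote_like(minority, ratio_percent=200, k=2, seed=0):
    """Create synthetic minority samples by interpolating toward near neighbours.

    Assumption: ratio_percent=200 yields two synthetic samples per original
    minority sample. Each sample is a tuple of numeric features.
    """
    rng = random.Random(seed)
    per_sample = ratio_percent // 100
    synthetic = []
    for x in minority:
        # The k nearest minority neighbours of x, excluding x itself.
        neighbours = sorted(
            (m for m in minority if m is not x),
            key=lambda m: math.dist(x, m),
        )[:k]
        for _ in range(per_sample):
            nb = rng.choice(neighbours)
            gap = rng.random()  # position along the segment from x to nb
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


if __name__ == "__main__":
    # Hypothetical confusion counts, not taken from the paper.
    print(round(g_mean(tp=10, fn=0, tn=92, fp=8), 2))
    # Hypothetical two-feature minority samples.
    print(len(smote_like([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)])))
```

Under these assumptions, a sensitivity of 1.00 combined with a G-mean near 0.96 would correspond to a specificity of roughly 0.92, which is why the two metrics are usually read together on imbalanced data.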
1. Introduction
Preterm birth is defined by the World Health Organization (WHO)
as delivery before 37 completed weeks of gestation (Organization
et al., 1977). It is usually subdivided by gestational age into
extremely preterm (< 28 weeks), very preterm (28–31 weeks), and
moderate and late preterm (32–36 weeks). Preterm birth affects about
5–18% of all pregnancies worldwide (Blencowe et al., 2013a).
Currently, it is the leading cause of death under 5 years of age, responsible
for nearly one million deaths in this age group (Liu et al., 2016),
and its complications are the single largest direct cause of neonatal
deaths, responsible for more than one third of them (Blencowe et al.,
2013a). Furthermore, survivors of preterm birth have a significant risk of
ongoing morbidity, including neurodevelopmental impairment,
cognitive dysfunction, learning difficulties, visual problems, and growth
problems (Blencowe et al., 2013a). For example, neurodevelopmental
impairment was estimated to affect 52% of newborns at < 28 weeks,
24% of newborns at 28–31 weeks, and 5% of newborns at 32–36 weeks
(Blencowe et al., 2013b). From an economic point of view, preterm birth
places a heavy burden on families, national health services, and health
insurance agencies. For instance, the annual economic burden
associated with preterm birth in the United States was 26.2 billion dollars in
https://doi.org/10.1016/j.compbiolchem.2020.107233
Received 24 December 2019; Received in revised form 7 February 2020; Accepted 8 February 2020
* Corresponding author.
E-mail addresses: hmrawashdeh@just.edu.jo (H. Rawashdeh), sda9170256@fgs.ju.edu.jo (S. Awawdeh), fat9170271@fgs.ju.edu.jo (F. Shannag),
asr9170277@fgs.ju.edu.jo (E. Henawi), hossam.faris@ju.edu.jo (H. Faris), obein@ju.edu.jo (N. Obeid), jon.hyett@sswahs.nsw.gov.au (J. Hyett).
Computational Biology and Chemistry 85 (2020) 107233
Available online 15 February 2020
1476-9271/ © 2020 Elsevier Ltd. All rights reserved.