Citation: Offiong, N.M.; Memon, F.A.; Wu, Y. Time Series Data Preparation for Failure Prediction in
Smart Water Taps (SWT). Sustainability 2023, 15, 6083. https://doi.org/10.3390/su15076083
Academic Editors: Ximing Cai and Erhu Du
Received: 13 January 2023
Revised: 17 February 2023
Accepted: 22 February 2023
Published: 31 March 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
sustainability
Article
Time Series Data Preparation for Failure Prediction in Smart
Water Taps (SWT)
Nsikak Mitchel Offiong 1,*, Fayyaz Ali Memon 1 and Yulei Wu 2
1 Centre for Water Systems, University of Exeter, Exeter EX4 4QF, UK
2 Department of Computer Science, EMPS, University of Exeter, Exeter EX4 4QF, UK
* Correspondence: no270@exeter.ac.uk
Abstract: Smart water tap (SWT) time series model development for failure prediction requires
acquiring data on the variables of interest to researchers, planners, engineers and decision makers.
Thus, the data are expected to be ‘noiseless’ (i.e., without discrepancies such as missing data, data
redundancy and data duplication) raw inputs for modelling and forecasting tasks. However, historical
datasets acquired from the SWTs contain data discrepancies that require preparation before applying
the dataset to develop a failure prediction model. This paper presents a combination of the generative
adversarial network (GAN) and the bidirectional gated recurrent unit (BiGRU) techniques for missing
data imputation. The GAN learns the trend and distribution of the SWT data, so that the
imputed data closely resemble the historical dataset. The BiGRU, in turn, was
adopted to save computational time by combining the model’s cell state and hidden state into a
single state during data imputation. Outliers remained after data imputation, and the exponential
smoothing method was used to smooth the data. The results show that this method can be applied in time series systems
to correct missing values in a dataset, thereby mitigating data noise that can lead to a biased failure
prediction model. Furthermore, when evaluated using different sets of historical SWT data, the
method proved reliable for missing data imputation and achieved a shorter training time than the
traditional data imputation method.
Keywords: missing data; generative adversarial network; bidirectional gated recurrent unit; smart
water tap; failure prediction; data imputation
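The exponential smoothing step mentioned in the abstract can be illustrated with a minimal sketch. It assumes simple (single) exponential smoothing, s_0 = x_0 and s_t = αx_t + (1 − α)s_{t−1}; the abstract does not state which variant or smoothing factor the authors used, so the function name, the value α = 0.3 and the sample readings below are hypothetical.

```python
def exponential_smoothing(values, alpha=0.3):
    """Simple exponential smoothing.

    s[0] = x[0]; s[t] = alpha * x[t] + (1 - alpha) * s[t-1].
    alpha is an assumed smoothing factor, not a value taken from the paper.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not values:
        return []
    smoothed = [float(values[0])]
    for x in values[1:]:
        # Blend the new observation with the previous smoothed value.
        smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical readings with one spike left over after imputation.
readings = [10.0, 11.0, 10.5, 40.0, 10.8, 11.2]
smoothed = exponential_smoothing(readings, alpha=0.3)
```

A smaller α damps spikes more strongly but reacts more slowly to genuine changes in the series, so in practice the factor would need to be tuned against the data.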
1. Introduction
A sustainable solution for rural water delivery requires accurate water infrastructure
assessment and efficient data processing techniques. These techniques need data, which
should come from regular usage of the water infrastructure. However, most rural water
installations lack accurate data in the available repository [1]. With inadequate,
partial, or missing data on the smart water taps, it is therefore difficult to develop a
comprehensive failure prediction model or an early warning system. Furthermore, investment
in extensive inspection and data-gathering programmes on smart rural taps to overcome
data gaps may not be financially feasible for rural water management agencies [2]. So, to
achieve failure prediction for rural water taps, the available time series data generated from
system usage, together with data manipulation, are sufficient for critical analysis irrespective of
the discrepancies.
Part of the aim of this paper is to develop a failure prediction model for smart water
taps to support proactive maintenance, which can help provide a sustainable water supply
to rural communities in sub-Saharan Africa and similar contexts. Solar-powered smart
water taps (SWT) deployed to rural areas in some parts of Africa are perceived as low-cost
and reliable water supply sources for domestic use in the region. These SWTs, often referred
to as e-taps, dispense water when a pre-paid token comes into contact with them. During
their functional time, the smart taps generate time series datasets that can be analysed