Water Consumption Demand Pattern Analysis using Uncertain Smart Water Meter Data

Milad Khaki¹,²,ᵃ and Nasim Mortazavi²,ᵇ

¹ Electrical and Computer Engineering Department, University of Waterloo, Waterloo, Canada
² Robarts Research Institute, Western University, London, Canada

Keywords: Smart Water Meters, Data Mining, Big Data, Big Data Errors, Case-based Reasoning.

Abstract: Wireless ‘smart’ water meters, which enable functionalities such as demand response, leak alerts, identification of characteristic demand patterns, and detailed consumption analysis, are becoming an essential part of water infrastructure in many countries. To achieve these benefits, the meter data needs to be error-free, which cannot be guaranteed in practice because some ‘dirtiness’ or ‘uncertainty’ of the data is mostly unavoidable. Additionally, by analyzing the smart meter data and finding demand patterns, it is possible to give municipalities insights to improve their distribution networks, better understand demand characteristics, and identify the consumers who contribute most to the high consumption peaks. This paper investigates solutions to mine the uncertain data, ensure the validity of results, and evaluate the impact of dirty data on data analysis results. Once the reliability of results is ensured, the evaluation results can be used for informed decision-making on water planning strategies. The paper also analyzes the consumption pattern of a city with 25 thousand water consumers and presents weekly consumption profiles over an entire year for single-family residential consumers. Additionally, a systematic study of the errors existing in large-scale smart water meter deployments is performed to better understand the nature of errors in such data sources, particularly in the first stages of implementing smart metering infrastructure.
Finally, the sensitivity of the results to various types of errors in a big data system is investigated.

ᵃ https://orcid.org/0000-0003-0566-727X
ᵇ https://orcid.org/0000-0002-3257-2463

1 INTRODUCTION

As a cost-saving measure, many municipalities have decided to install wireless ‘smart’ water meters that, among other benefits, primarily enable them to read meters remotely. Examples include Toronto and Saskatoon, in Canada, and Baltimore and Pittsburgh, in the United States.

A substantial fraction of the data obtained from virtually all large-scale meter deployments can be incorrect (see, for example, (Quilumba et al., 2014), (Liu et al., 2018), (Shishido, 2012), (Kaisler et al., 2013), (Sivarajah et al., 2017), (Chen et al., 2013), (Lon House, 2011), and (Courtney, 2014)). This paper highlights how data errors reduce the benefits of applying big data concepts. To evaluate the data quality, the impact of uncertain data on the identification of customers contributing to a peak load is examined. The proposed progressive approach helps to identify errors and their origins and to find solutions to remove them. Essentially, data cleaning or data quality evaluation must precede any analysis of smart meter data.

The contributions of this work can be summarized as (1) a systematic study of the errors existing in large-scale smart water meter deployments and in the water literature, (2) a progressive data cleaning approach to the problem of finding errors in smart meter data, (3) a careful study of the impact of dirty data on peak load attribution, and (4) an introduction to and classification of the techniques available for removing errors from dirty data, including the methods applied in this study. The remainder of the paper is structured as follows.
Section 2 provides a generalized model of smart water meter infrastructure. Section 3 presents the progressive data cleaning approach and the data quality issues mainly encountered in the current study, together with the adopted or produced solutions. As the final part of the case study, the results of using the cleaned dataset are pre-

Khaki, M. and Mortazavi, N. Water Consumption Demand Pattern Analysis using Uncertain Smart Water Meter Data. DOI: 10.5220/0010834900003116. In Proceedings of the 14th International Conference on Agents and Artificial Intelligence (ICAART 2022) - Volume 3, pages 436-443. ISBN: 978-989-758-547-0; ISSN: 2184-433X. Copyright © 2022 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved