Machine Learning-Based Clustering of Load
Profiling to Study the Impact of Electric Vehicles
on Smart Meter Applications
Saeed Ahmed, Zafar Ali Khan
Mirpur University of Science
& Technology, Pakistan
saeed.ahmed@must.edu.pk
Noor Gul
University of Peshawar, Pakistan
noor@uop.edu.pk
Junsu Kim, Su Min Kim
Korea Polytechnic University, Korea
suminkim@kpu.ac.kr
Abstract— The data collected from advanced metering
infrastructure enables electric utilities to gain deep insight
into the energy consumption behavior of consumers. However,
the load signature and consumption pattern vary with the addition of
new types of loads, such as electric vehicles (EVs).
Therefore, it becomes imperative to investigate these variations further.
To this end, this paper investigates the impact of inserting EV
profiles into household-level smart meter data. The Irish CER
dataset and EV data from the NREL residential PEV dataset are used in
this study to classify users with and without EV loads. The
results show that a change in cluster membership can help to
separate the consumers with an EV load from the stand-alone
consumers without an EV load.
Keywords— Data clustering; electric vehicles; load profiling;
smart meter
I. INTRODUCTION
Smart meters deliver detailed knowledge about
individual consumers’ load patterns, which can be further used
to control loads even at the level of individual households [1, 2]. The
challenges posed by the curse of dimensionality of the data can
be managed by classifying consumers into different classes
and extracting typical load profiles using machine learning-based
techniques.
Extraction of load patterns from smart meter data is a
cumbersome process that can be tackled by supervised or
unsupervised machine learning (ML) techniques [3]. Multiple
studies have been carried out to classify patterns in
unlabelled smart meter data; however, the impact of integrating
electric vehicles (EVs) on consumer classification remains a
promising area of study.
Information on EV charging may help electric utilities to
predict load and to comprehend its temporal and spatial
aspects for 1) load scheduling and 2) avoiding distribution
network renovations [4]. Consumers with EVs usually hide the
purchase of an EV from the utility, so their
energy consumption pattern shifts without the utility's knowledge,
leading to wrong categorization of such consumers. Generally,
non-intrusive load monitoring is employed to disaggregate load
for EV detection. However, it is a complicated technique that
requires high-granularity data sampled at intervals of seconds [5].
In this paper, we investigate the impact of including EVs
at the consumer level, considering different diffusion levels of EV
charging profiles. The smart meter data from the Irish CER dataset
[6], which has 30-minute resolution, is interpolated to 10-minute
resolution so that EV charging profiles with 10-minute
resolution can be embedded. The profiles with and without EVs are clustered, and
the change in cluster membership due to the inclusion of EVs is observed.
Accordingly, the impact of EVs is investigated in this paper.
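The interpolation and embedding step described above can be sketched as follows. This is only a minimal illustration, not the implementation used in this work: linear interpolation is an assumption (the interpolation method is not stated here), the readings are treated as average power values, and the function names, the toy household series, and the 3.3 kW charging level are hypothetical.

```python
import numpy as np

def upsample_30_to_10(load_30min):
    """Linearly interpolate a 30-minute series onto a 10-minute grid.

    Assumption: linear interpolation between consecutive readings.
    """
    load_30min = np.asarray(load_30min, dtype=float)
    t_30 = np.arange(load_30min.size) * 30.0        # sample times in minutes
    t_10 = np.arange(0.0, t_30[-1] + 1.0, 10.0)     # 10-minute grid, same span
    return np.interp(t_10, t_30, load_30min)

def embed_ev(load_10min, ev_10min):
    """Add an EV charging profile to a household profile of equal length."""
    load_10min = np.asarray(load_10min, dtype=float)
    ev_10min = np.asarray(ev_10min, dtype=float)
    if load_10min.shape != ev_10min.shape:
        raise ValueError("household and EV profiles must have the same length")
    return load_10min + ev_10min

# Hypothetical household readings at 30-minute resolution
household_30 = np.array([0.3, 0.6, 0.3])
household_10 = upsample_30_to_10(household_30)

# Hypothetical EV profile: charging at 3.3 kW for three 10-minute steps
ev_10 = np.zeros_like(household_10)
ev_10[2:5] = 3.3

combined = embed_ev(household_10, ev_10)
```

Linear interpolation keeps the original 30-minute readings unchanged at their own time stamps and fills the two intermediate 10-minute points between them. Other schemes (for example, holding each reading constant over its interval) would be equally valid under different assumptions about the meter data.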
The rest of the paper is arranged as follows. Section II
explains the proposed scheme. The case studies and results
with their analysis are presented in Section III. The paper is
concluded in Section IV.
II. METHODOLOGY
This paper aims to investigate the clustering of load
profiles inclusive of EVs. The Irish CER smart meter dataset
employed in this work [6] contains readings taken at
30-minute intervals for more than 5,000 residential and
small business consumers over a period of 18 months. The EV
charging data used in this case is from the 2009 RECS data set
provided by NREL [7]. For the case studies, 200 customers are randomly selected
from the smart meter dataset and, similarly, 30 EV charging
profiles are selected from the NREL dataset. A
flowchart of the proposed scheme adopted for the case studies
is given in Figure 1.
Figure 1: Flowchart of the proposed scheme
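To illustrate how a change in cluster membership can be observed, the following sketch clusters synthetic daily profiles and then reassigns the EV-embedded profiles to the same centroids. It rests on several assumptions that are not taken from this paper: k-means with Euclidean distance is used as a stand-in clustering method, the number of clusters (4) is arbitrary, the profiles are randomly generated rather than drawn from the CER or NREL data, and the EV load is a hypothetical 3.3 kW evening block. Only the counts of 200 consumers and 30 EV profiles follow the text.

```python
import numpy as np

def assign(profiles, centroids):
    """Return the index of the nearest centroid (Euclidean) for each profile."""
    dist = np.linalg.norm(profiles[:, None, :] - centroids[None, :, :], axis=2)
    return dist.argmin(axis=1)

def kmeans(profiles, k, n_iter=50, seed=0):
    """Plain k-means (Lloyd's algorithm); returns the final centroids."""
    rng = np.random.default_rng(seed)
    # Initialise centroids with k distinct profiles chosen at random
    centroids = profiles[rng.choice(len(profiles), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(profiles, centroids)
        # Recompute each centroid; keep the old one if its cluster is empty
        updated = np.array([
            profiles[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids

rng = np.random.default_rng(42)
n_consumers, n_steps, n_ev = 200, 144, 30   # 144 ten-minute steps per day

# Synthetic baseline profiles (hypothetical values, not CER data)
base = 0.5 + 0.3 * rng.random((n_consumers, n_steps))

# Randomly pick the consumers that receive an EV charging profile
ev_idx = rng.choice(n_consumers, size=n_ev, replace=False)
with_ev = base.copy()
with_ev[ev_idx, 108:132] += 3.3             # hypothetical 18:00-22:00 charging

# Cluster the baseline, then assign both data sets to the same centroids
centroids = kmeans(base, k=4)
labels_before = assign(base, centroids)
labels_after = assign(with_ev, centroids)

# Consumers whose cluster membership changed after EV insertion
changed = np.flatnonzero(labels_before != labels_after)
print(f"{changed.size} of {n_ev} EV consumers changed cluster")
```

Because the centroids are held fixed, only consumers whose profile was modified can change cluster, so any membership change points to a consumer with an embedded EV profile. Re-clustering the combined data instead would be an alternative design with different behavior.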
The proposed scheme is explained as follows:
A. Data Pre-processing
In the data pre-processing stage, the first step is to ensure
the quality of the data. To this end, outliers
are removed and the data is cleansed by removing erroneous
values. Potential hardware failures in the first month can lead
to zero kWh readings; therefore, all such readings are removed
444 978-1-7281-6476-2/21/$31.00 ©2021 IEEE ICUFN 2021