A Novel Cleansing Method for Random-Walk Data using Extended Multivariate Nonlinear Regression: A D
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 183 - Number 16
Year of Publication: 2021
Authors: Hussein Bakiri, Hamisi Ndyetabura, Libe Massawe, Hellen Maziku
10.5120/ijca2021921503

Abstract

The efficiency of any load forecasting mechanism depends on the quality and distribution characteristics of the training data. Outliers and missing values are the primary concern, especially in load data from developing countries. Several research works have proposed imputation models to deal with outliers before forecasting. However, the efficiency of these approaches is compromised when the data follows a random-walk distribution. Thus, this study aims to develop an efficient data cleansing model that accounts for a random-walk distribution by extending the Multivariate Nonlinear Regression (MNLR) method. The k-means algorithm is used to detect outliers in the data and to analyze their size. Twenty-minute interval load data from 2015 to 2019, collected at Kinondoni-North (at the Mikocheni distribution network in Dar es Salaam), is used in this study. The empirical results of the outlier analysis show that outliers make up 5.17852% of the data (5207 out of 105192). Finally, the extended-MNLR (e-MNLR) model achieves promising results compared with the ANN, SVM, MissForest, MICE, and KNN algorithms by attaining 2.109137, 1.956039, and