Vol 8. No. 1 Issue 2 – May, 2015
African Journal of Computing & ICT
© 2015 Afr J Comp & ICT – All Rights Reserved - ISSN 2006-1781
www.ajocict.net
An Improved Technique for the Removal and Replacement of the
Inconsistencies in Numeric Dataset
J. Abdul-Hadi
Department of Mathematics,
Bauchi State University, Gadau, Nigeria.
jamcy98@gmail.com
A.R. Ajiboye
Department of Computer Science,
University of Ilorin, Ilorin, Nigeria.
ajibabdulraheem@gmail.com
A. Abba
Department of Statistics,
Abubakar Tafawa Balewa University, Bauchi, Nigeria.
abdulhafeezabba@gmail.com
ABSTRACT
The removal of anomalies from an unclean numeric dataset, with a view to putting the data in a suitable format for exploration, is a major phase in the data mining process. When an unclean numeric dataset is explored to unveil its useful patterns or structure, a thorough pre-processing task is inevitable in order to achieve a noise-free dataset. Poor quality data can be misleading when analysed or used to build models; hence, there is a need to remove discrepancies that may be present in the data prior to exploring it. In this paper, a cleaning algorithm is proposed and implemented in order to remove the inconsistencies in a numeric dataset. The proposed algorithm is implemented in the Java language, and the resulting outputs reveal the efficiency of the proposed approach. In order to evaluate its effectiveness, the proposed algorithm is compared with an existing method using a number of metrics. The comparisons show that the proposed technique is efficient and can be used as an alternative technique for the removal of outliers in numeric data. The approach is also found to be reliable, as it consistently gives an accurate output that is free of outliers.
Keywords: Data cleansing, Data mining, Outlier detection, Clustering.
African Journal of Computing & ICT Reference Format:
J. Abdul-Hadi., A.R. Ajiboye & A. Abba (2015): An Improved Technique for the Removal and Replacement of the Inconsistencies in Numeric
Dataset. Afr J. of Comp & ICTs. Vol 8, No. 1, Issue 1. Pp 39-44
1. INTRODUCTION
Pre-processing is the task performed on the dataset in order
to make it suitable for exploration. Data cleansing, data
cleaning and data scrubbing are sometimes used
interchangeably to describe the pre-processing task of
putting the data in a clean state [1]. Real-world data are often incomplete or noisy, and it is rare to obtain a perfect dataset. Exploration or analysis of an unclean dataset tends to produce results that deviate from the actual results, because anomalies present in the data can distort the resulting outputs. As reported in [2], the use of quality data is crucial to obtaining high-quality patterns.
Putting several files together can ease exploration processes,
as efforts to reveal the patterns and structure of the data
would be more focused on a single database. However,
integration of files from different sources is prone to
duplication of records, and human errors in the course of
entering data may sometimes violate the declared integrity
constraints [3]. The basic tasks performed in preparing data generally involve correcting errors that typically emanate from human and/or machine input, and filling in nulls and incomplete data. Manually filling in missing values, however, quickly becomes monotonous, which may in turn introduce new errors.
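The automatic filling of missing values described above can be illustrated with a short sketch. The code below is not the paper's algorithm; it is a minimal, hypothetical example of one common imputation strategy, in which missing entries (encoded here as NaN) in a numeric column are replaced by the mean of the observed values. The class and method names are illustrative only.

```java
import java.util.Arrays;

// Hypothetical illustration of automatic missing-value imputation:
// each missing entry (NaN) in a numeric column is replaced by the
// mean of the non-missing values in that column.
public class MissingValueFiller {

    // Returns a copy of the column with every NaN entry replaced
    // by the mean of the non-missing entries.
    public static double[] fillWithMean(double[] column) {
        double sum = 0.0;
        int count = 0;
        for (double v : column) {
            if (!Double.isNaN(v)) {   // skip missing entries
                sum += v;
                count++;
            }
        }
        // If every entry is missing, fall back to 0.0 as the fill value.
        double mean = count > 0 ? sum / count : 0.0;

        double[] filled = new double[column.length];
        for (int i = 0; i < column.length; i++) {
            filled[i] = Double.isNaN(column[i]) ? mean : column[i];
        }
        return filled;
    }

    public static void main(String[] args) {
        // The observed values 4.0, 8.0 and 6.0 have mean 6.0,
        // so the missing entry is filled with 6.0.
        double[] column = {4.0, Double.NaN, 8.0, 6.0};
        System.out.println(Arrays.toString(fillWithMean(column)));
        // prints [4.0, 6.0, 8.0, 6.0]
    }
}
```

Such mechanical rules avoid the monotony of manual correction, but they only approximate the true values; the choice of fill strategy (mean, median, or a model-based estimate) depends on the data at hand.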