1949-3053 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TSG.2018.2832840, IEEE Transactions on Smart Grid

Abstract--This paper presents an approach to reduce the multiple estimation effect of fault location algorithms. This effect occurs in fault location techniques based on fault distance estimation for radial distribution feeders. The approach is based on a Data Mining technique called DAMICORE, which is executed from the perspective of Cloud Computing and in the context of Smart Grids. The voltage and current signals are acquired by smart meters and disturbance recorders, which extract a feature vector from them and send it to the cloud. The cloud is then responsible for executing DAMICORE, which, in turn, defines relations among the faulty events. The IEEE 34-Bus Test Feeder was simulated as the test case system. The data mining process was able to reduce errors due to the multiple estimation of faulty branches.

Index Terms--Cloud computing, DAMICORE, data mining, multiple estimation, fault location, feature extraction, smart grids.

I. INTRODUCTION

THE growing concern about environmental issues, particularly the need to reduce carbon emissions, and the consequent increase in incentives to use alternative energy sources have given a new direction to the technological evolution of power systems. This trend is reinforced by the deregulation of the energy sector and access to the free energy market.
The context of Smart Grids arises specifically from the need to meet the interests of the agents involved in this new process [1], [2]. In general terms, Smart Grids can be defined as electrical systems in which all (or part) of the equipment, as well as the agents involved (consumers and utilities), are interconnected through a communication network supported by digital technology. This allows consumers to be active agents, stimulating improvements in Power Quality, reliability, and the efficiency of the distribution network [3], [4].

In the context of Smart Grids, it is important to consider the infrastructure where data and computational resources are allocated, which opens up a discussion about the use of Cloud Computing [5]. Cloud Computing can be defined as a new computing paradigm in which computing resources are grouped into large virtual repositories that are easy to access and use and are made available as services. It offers some advantages over traditional computing [6], [7], such as more flexibility, scalability, high resource allocation capacity, the possibility of remote access, and increased operational safety. Moreover, it is considered a technology capable of effectively integrating the various domains involved in Smart Grids. Reference [8] addresses the concept of Cloud Computing in the context of Smart Grids, aiming to reduce operational costs and to simplify the system using smart meters.

E. A. Reche, J. V. Sousa, and D. V. Coury are with the Department of Electrical and Computer Engineering at São Carlos School of Engineering, University of São Paulo, São Carlos, SP 13566-590 Brazil (e-mails: evandroreche@usp.br, jeovane@usp.br, coury@sc.usp.br).

In this scenario, one of the topics that benefits from this new technology is fault location in power distribution feeders, including the possibility of using smart meters for this purpose [9]–[11].
Fault location in transmission systems is not a new subject, and some efficient approaches have been proposed over the past years [12]. However, most of these algorithms are not applicable to radial distribution feeders because of the distinctive characteristics of such feeders. Some authors have developed stand-alone algorithms that take into account the topological characteristics of radial distribution feeders [13]. In general, classical fault location techniques estimate the fault distance using impedance-based methods. The pre-fault and post-fault effective values of the fundamental currents and voltages at power substations are used for this purpose [14]–[16]. The main drawback of this type of approach is the multiple estimation problem: due to the topological characteristics of the distribution network, multiple points fulfill the equivalent impedance condition. Consequently, more than one location in the circuit has the same estimated impedance, which introduces uncertainty into the fault location process. This particular aspect has been addressed in [17] and [18]. Morales-España et al. [17] presented a conceptual approach for eliminating the multiple estimation problem of impedance-based fault location methods, using the available measurements of current and voltage fundamentals at the power substation. The tests performed showed good results for different power systems. Krishnathevar and Ngu [18] also presented a generalized one-end impedance-based fault location method. The development of new expressions for the current distribution factor and the derivation of the

R. A. S. Fernandes is with the Department of Electrical Engineering, Federal University of São Carlos, São Carlos, SP 13565-905, Brazil (email: ricardo.asf@ufscar.br).
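The multiple estimation problem described in the Introduction can be illustrated with a minimal sketch. It assumes a uniform impedance per unit length, so that the estimated impedance maps directly to a distance from the substation; the feeder topology, node names, and the function candidate_locations are hypothetical and are not taken from the IEEE 34-Bus Test Feeder or from the method proposed in this paper.

```python
from collections import defaultdict

# Hypothetical radial feeder: (from_node, to_node, length_km).
# Names and lengths are illustrative only.
SEGMENTS = [
    ("SUB", "A", 2.0),
    ("A", "B", 3.0),   # main trunk
    ("A", "C", 1.5),   # lateral
    ("C", "D", 2.5),
    ("B", "E", 4.0),
]


def candidate_locations(segments, root, distance_km):
    """Return every (from_node, to_node, offset_km) point whose path
    length from the root equals distance_km."""
    children = defaultdict(list)
    for u, v, length in segments:
        children[u].append((v, length))

    candidates = []
    stack = [(root, 0.0)]  # (node, cumulative distance from the root)
    while stack:
        node, travelled = stack.pop()
        for child, length in children[node]:
            # The point lies on this segment if the target distance
            # falls inside the interval covered by the segment.
            if travelled < distance_km <= travelled + length:
                candidates.append((node, child, distance_km - travelled))
            stack.append((child, travelled + length))
    return sorted(candidates)


# A single distance estimate of 4 km matches one point on the trunk
# and one point on the lateral.
print(candidate_locations(SEGMENTS, "SUB", 4.0))
# prints [('A', 'B', 2.0), ('C', 'D', 0.5)]
```

In this toy feeder, one distance estimate yields two candidate fault points on different branches, which is the ambiguity that the substation measurements alone cannot resolve.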
Data Mining-Based Method to Reduce Multiple Estimation for Fault Location in Radial Distribution Systems

Evandro Agostinho Reche, Jeovane Vicente de Sousa, Denis Vinicius Coury, Member, IEEE, and Ricardo Augusto Souza Fernandes, Member, IEEE