International Journal of Advanced Engineering, Management and Science (IJAEMS) [Vol-2, Issue-6, June-2016]
Infogain Publication (Infogainpublication.com) ISSN: 2454-1311
www.ijaems.com Page | 748

A Hybrid Approach to Power Theft Detection
Saurabh Jain, A. M. Karandikar
Department of CSE, Ramdeobaba College of Engineering and Management, Nagpur, India

Abstract— Power theft is a common problem faced by all electricity companies. Because power theft directly affects the profit these companies make, detecting and preventing it is essential. In this paper we propose a hybrid approach to detect electricity theft, i.e. to identify consumers suspected of stealing power. Our approach uses SVM and ELM, and we compare it with KNN.

Keywords— ELM, KNN, Power Theft, SVM Classification, Technical Loss, Non-Technical Loss.

I. INTRODUCTION
Power theft is a major problem for all electricity companies. The problem is not limited to Indian companies; electricity companies in other countries face it as well, and they lose money every year because of it.

There are two types of losses: transmission loss and non-transmission loss. Transmission loss occurs while energy is transmitted from the generation side to the consumer's side. Its causes include improper insulation and resistance in the wire. Non-transmission losses occur due to wrong billing, false meter readings, electricity theft, etc. The first two of these can be prevented by taking proper meter readings and calculating an accurate bill for the electricity consumed, but electricity theft is hard to prevent because no one can predict which consumers are honest and which are dishonest. Losses due to electricity theft can still be kept to a minimum by finding fraudulent consumers. Power theft can be carried out in various ways, for example by bypassing the meter or tampering with meter readings.
Theft detection is done manually by inspecting consumers. This is a time-consuming process that requires a large number of field staff. Its cost is high and its detection rate is not very high. To reduce these costs, data mining techniques are nowadays used to detect theft. We propose a hybrid approach to theft detection, which will improve detection accuracy and lower the cost of the whole process.

II. BACKGROUND WORK
A number of methods have been proposed and implemented for finding and estimating power theft.

[1] This paper presents a framework to identify power loss activities. The authors used automatic feature extraction on customer profiles together with ELM, OS-ELM and SVM to identify customers who commit fraud. They extracted consumption patterns using data mining and statistical techniques, and ELM, OS-ELM and SVM classify the profiles for fraud detection. They use outlier detection to find fraudulent customer profiles; if an outlier is found and it is due to a power loss activity, the profile is used as a reference. ELM and OS-ELM are the main classifiers of their framework.

[3] This paper discusses the problems encountered in theft detection and earlier ways of reducing theft. The authors developed approximate consumption patterns for classification using customer load profiles and artificial intelligence tools. They then trained an SVM to classify the data based on suspicious energy consumption.

[4] This paper presents a framework to detect non-technical losses. The authors use a genetic algorithm and a support vector machine. Their approach selects suspected customers for onsite investigation so that theft can be identified.

III. PROPOSED WORK
Machine learning is the study of algorithms that can learn from and make predictions on data. We propose a hybrid approach to find customers suspected of power theft. We collected data from the IT Office of MSEDCL.
This data covers 24 months of consumption for each customer. The dataset consists of fields such as consumer number, tariff code, connection load, unit consumption per month, and meter status. We separated the dataset into a training set and a test set (roughly 80% for training and 20% for testing). Our approach has two main phases: a training phase and a data classification phase.
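
The 80%/20% split described above could be sketched as follows. This is only an illustration under stated assumptions: the record layout mirrors the fields listed in the text, but the field names, the `ConsumerRecord` type, the helper functions, and the synthetic values are all hypothetical and are not the MSEDCL schema or data. The text does not say whether the split was random, so the seeded shuffle here is also an assumption.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class ConsumerRecord:
    """One consumer's profile. Field names are illustrative assumptions."""
    consumer_number: str
    tariff_code: str
    connection_load: float      # unit is an assumption (e.g. kW)
    monthly_units: List[float]  # 24 monthly unit-consumption values
    meter_status: str


def make_synthetic_records(n: int, seed: int = 0) -> List[ConsumerRecord]:
    """Generate placeholder records; the values carry no real-world meaning."""
    rng = random.Random(seed)
    return [
        ConsumerRecord(
            consumer_number=f"C{i:05d}",
            tariff_code=rng.choice(["T1", "T2"]),
            connection_load=round(rng.uniform(1.0, 10.0), 2),
            monthly_units=[round(rng.uniform(50.0, 500.0), 1) for _ in range(24)],
            meter_status=rng.choice(["normal", "faulty"]),
        )
        for i in range(n)
    ]


def split_train_test(
    records: Sequence[ConsumerRecord],
    train_fraction: float = 0.8,
    seed: int = 42,
) -> Tuple[List[ConsumerRecord], List[ConsumerRecord]]:
    """Shuffle a copy of the records and cut it into training and test sets."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(records)               # leave the caller's sequence untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    records = make_synthetic_records(100)
    train, test = split_train_test(records)
    print(len(train), len(test))
```

Splitting by consumer, as done here, keeps all 24 months of one consumer on the same side of the split, so no consumer appears in both the training and the test set.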