International Research Journal of Engineering and Technology (IRJET), e-ISSN: 2395-0056, p-ISSN: 2395-0072, Volume: 03, Issue: 04 | Apr-2016, www.irjet.net
© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal

Minimizing Loss of Accuracy for Seismic Hazard Prediction using Naive Bayes Classifier

Kalyan Netti 1, Dr. Y Radhika 2
1 Senior Scientist, NGRI, Hyderabad
2 Associate Professor, GITAM University, Visakhapatnam

Abstract - Classification is one of the most important techniques in Data Mining for data analysis. Different classification techniques are available to predict the outcome for a given dataset, and one well-known method is the Naïve Bayes Classifier. Naïve Bayes is very popular because it is easy to build; however, its assumption of conditional independence among predictors results in a loss of accuracy. In this paper, we propose a technique to minimize this loss of accuracy when predicting seismic hazard activity. A hazard is a possible threat to life, health, property, or the environment. Mitigating a hazard once it crosses a stipulated level is paramount; otherwise, it may lead to an emergency. Mining Hazard is one of the most dangerous hazards in mining activities. Exploration for minerals, diamonds, gold, and coal involves mining on a large scale, where hazards are quite common and addressing them is a challenging task. A substantial part of Mining Hazard is Seismic Hazard, which is common in underground mines. Thus, predicting Seismic Hazard is one of the most important aspects of countering Mining Hazards. In this paper, the authors propose a novel method for minimizing the loss of accuracy in the Naïve Bayes Classifier.
The proposed technique, applied to the NBC, gave better accuracy even under the assumption of Conditional Independence.

Key Words: Data Mining, Classification, Naïve Bayes Classifier, Conditional Independence, Accuracy, Hazard, Seismic Hazard

1. INTRODUCTION

Data Mining is the process of extracting useful and relevant information from data [1], and it offers many techniques for doing so. With advanced technologies now employed in engineering, finance, health, and other areas, the volume of data collected and accumulated is increasing exponentially. With such massive amounts of data available, the primary task is to understand the data and extract knowledge from it. Obtaining useful information from massive amounts of data is a complex task and needs very efficient algorithms and techniques. This area is being explored extensively using new processes and methods along with statistical techniques.

One such effective technique in Data Mining is Classification. Many Classification techniques are available, such as Bayesian Networks, Decision Trees, Nearest Neighbour, and Neural Networks. In general, classification is an analysis technique that derives a model from prior observations and uses it to predict the outcome for new data. The Naïve Bayes Classifier (NBC) is a popular and efficient method for data classification. Naïve Bayes, a Supervised Classification technique, is effective because it is easy to build, computationally inexpensive, and capable of handling massive datasets. Moreover, the Naïve Bayes Classifier performs well compared to other predictive models, and one of the main reasons is its assumption of conditional independence among predictors [2][4][7]. This very assumption of independence, however, sometimes leads to a loss of accuracy in the Naïve Bayes Classifier.
The loss of accuracy can be greater when a data set has attributes that are strongly inter-related. Thus, improving the Naïve Bayes Classifier while retaining the assumption of independence among predictors is a challenging task [4][5].

In this paper, seismic hazard data downloaded from the UCI repository [6] is used to estimate the accuracy of predicting hazard occurrence with the NBC. There is an urgent need to mitigate seismic hazard events, and classification techniques may address this issue. Mining Hazard, which is common in mining activities, is one of the most dangerous hazards. A substantial part of Mining Hazard is Seismic Hazard, which is common in underground mines. Thus, predicting Seismic Hazard is one of the most important aspects of countering Mining Hazards, and any loss of accuracy in the algorithm used will result in a wrong estimation of the hazard. In this paper, the authors present a novel method to minimize the loss of accuracy in the Naïve Bayes Classifier caused by the assumption of independence among predictors [2][3]. The experimental results show that the proposed method performed well and improved accuracy compared to the traditional Naïve Bayes Classifier.

Section 2 discusses the Naïve Bayes Classifier, Section 3 describes the data set, Section 4 presents the implementation, Section 5 discusses the results, and the last section presents the conclusions.
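
The effect of the independence assumption described in this introduction can be illustrated with a minimal Python sketch. This is not the method proposed in this paper and it does not use the UCI seismic data. It is a plain categorical Naïve Bayes with Laplace smoothing, run on a tiny made-up data set. The function names (train_naive_bayes, posterior), the attribute values ("high", "low"), and the class labels ("hazard", "safe") are all assumptions introduced only for illustration. The sketch scores the same record twice: once with a single attribute, and once with that attribute duplicated, which simulates two perfectly correlated predictors.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-attribute value frequencies per class."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values_seen = [set() for _ in rows[0]]  # distinct values per attribute
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            values_seen[j].add(value)
    return class_counts, value_counts, values_seen


def posterior(model, row):
    """Return P(class | row) assuming attributes are independent given the class."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total  # prior P(c)
        for j, value in enumerate(row):
            # Laplace-smoothed likelihood P(x_j = value | c)
            score *= (value_counts[(j, c)][value] + 1) / (n_c + len(values_seen[j]))
        scores[c] = score
    norm = sum(scores.values())
    return {c: s / norm for c, s in scores.items()}


# Hypothetical toy data: one attribute (an energy reading) and a class label.
rows = [("high",), ("high",), ("high",), ("low",),
        ("low",), ("low",), ("low",), ("low",)]
labels = ["hazard", "hazard", "safe", "hazard",
          "safe", "safe", "safe", "safe"]

# Same data with the attribute copied, i.e. two perfectly correlated predictors.
rows_dup = [(v, v) for (v,) in rows]

p_single = posterior(train_naive_bayes(rows, labels), ("high",))
p_double = posterior(train_naive_bayes(rows_dup, labels), ("high", "high"))

print("P(hazard | one attribute):       ", round(p_single["hazard"], 3))
print("P(hazard | duplicated attribute):", round(p_double["hazard"], 3))
```

The duplicated attribute carries no new information, yet the classifier multiplies its likelihood in a second time, so the estimated probability of the "hazard" class moves further from the prior. With strongly inter-related attributes, this kind of double counting is one way the independence assumption can distort the estimate and cost accuracy.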