International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 03 Issue: 04 | Apr-2016 www.irjet.net p-ISSN: 2395-0072
© 2016, IRJET | Impact Factor value: 4.45 | ISO 9001:2008 Certified Journal | Page 75
Minimizing Loss of Accuracy for Seismic Hazard Prediction using Naive
Bayes Classifier
Kalyan Netti¹, Dr. Y Radhika²
¹Senior Scientist, NGRI, Hyderabad
²Associate Professor, GITAM University, Visakhapatnam
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Classification is one of the most important
techniques in Data Mining for data analysis. Different
Classification techniques are available to predict the outcome
for a given dataset and to estimate accuracy; one well-known
method is the Naïve Bayes Classifier (NBC). Naïve Bayes is very
popular because it is easy to build; however, its assumption of
conditional independence among predictors results in a loss of
accuracy. In this paper, we propose a novel technique to
minimize this loss of accuracy when predicting Seismic Hazard
activity. A hazard is a possible threat to life, health,
property and the environment. Mitigating a hazard once it
crosses a stipulated level is paramount; otherwise, it may lead
to an emergency. Mining Hazard is one of the most dangerous
hazards. Exploration for minerals, diamonds, gold and coal
involves mining on a large scale, where hazard occurrence is
quite common, and addressing this Mining Hazard is a
challenging task. A substantial part of Mining Hazard is
Seismic Hazard, which is common in underground mines. Thus,
predicting Seismic Hazard is one of the most important aspects
of countering Mining Hazards. The proposed technique, applied
to NBC, gave better accuracy even with the assumption of
conditional independence.
Key Words: Data Mining, Classification, Naïve Bayes
Classifier, Conditional Independence, Accuracy, Hazard,
Seismic Hazard
1. INTRODUCTION
Data Mining is the process of extracting useful and relevant
information from data [1], and it offers many techniques for
doing so. With advanced technologies employed in areas such as
engineering, finance and health, the amount of data collected
and accumulated is increasing exponentially. With such massive
amounts of data available, the primary task is to understand
the data and extract knowledge from it. Obtaining useful
information from massive amounts of data is a complex task
that needs very efficient algorithms and techniques. This area
is being explored extensively by employing new processes and
methods along with statistical techniques. One such effective
technique in Data Mining is Classification. Many
Classification techniques are available, such as Bayesian
Networks, Decision Trees, Nearest Neighbour and Neural
Networks. In general, classification is an analysis technique
that derives models from prior observations in order to
predict an outcome. The Naïve Bayes Classifier (NBC) is a
popular and efficient method for data classification. Naïve
Bayes, a supervised classification technique, is effective
because it is easy to build, computationally simple and
capable of handling massive datasets. Moreover, the Naïve
Bayes Classifier performs well compared to other predictive
models, and one of the main reasons is its assumption of
conditional independence among predictors [2][4][7]. This very
assumption of independence sometimes leads to a loss of
accuracy in the Naïve Bayes Classifier. The loss of accuracy
can be greater when data sets have attributes that are
strongly inter-related. Thus, improving the Naïve Bayes
Classifier while retaining the assumption of independence
among predictors is a challenging task [4][5].
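The effect of inter-related attributes described above can be illustrated with a minimal sketch. It is not the method proposed in this paper; it only shows how the independence assumption over-counts evidence when one attribute is a copy of another. The class names, the attribute `energy` and all probabilities are hypothetical values chosen for illustration and are not taken from the seismic hazard data.

```python
def naive_bayes_posterior(priors, likelihoods, observed):
    """Return P(class | observed) under the conditional-independence assumption.

    priors:      {class: P(class)}
    likelihoods: {class: [ {value: P(value | class)}, ... ]}, one table per attribute
    observed:    list of observed attribute values, one per attribute
    """
    scores = {}
    for cls, prior in priors.items():
        score = prior
        # Independence assumption: multiply the per-attribute likelihoods.
        for table, value in zip(likelihoods[cls], observed):
            score *= table[value]
        scores[cls] = score
    total = sum(scores.values())
    return {cls: s / total for cls, s in scores.items()}


# Hypothetical numbers, for illustration only.
priors = {"hazard": 0.5, "no_hazard": 0.5}
energy = {
    "hazard": {"high": 0.8, "low": 0.2},
    "no_hazard": {"high": 0.4, "low": 0.6},
}

# One attribute observed as "high".
single = naive_bayes_posterior(
    priors, {c: [energy[c]] for c in priors}, ["high"]
)

# The same attribute entered twice (two perfectly correlated attributes).
# The second copy carries no new information, yet it is counted again.
duplicated = naive_bayes_posterior(
    priors, {c: [energy[c], energy[c]] for c in priors}, ["high", "high"]
)

print(round(single["hazard"], 3))      # 0.667
print(round(duplicated["hazard"], 3))  # 0.8
```

With a single attribute the posterior for `hazard` is 2/3. Adding an exact copy of that attribute raises it to 0.8, although no new evidence was observed. This over-confidence is one way in which strongly related attributes can reduce the accuracy of a Naïve Bayes Classifier.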
In this paper, seismic hazard data downloaded from the UCI
repository [6] is used to estimate the accuracy of hazard
occurrence prediction using NBC. There is an urgent need to
mitigate seismic hazard events, and classification techniques
may address this issue. One of the most dangerous hazards is
Mining Hazard, which is common in mining activities. A
substantial part of Mining Hazard is Seismic Hazard, which is
common in underground mines. Thus, predicting Seismic Hazard
is one of the most important aspects of countering Mining
Hazards, and any loss of accuracy in the algorithm used will
result in a wrong estimation of hazard.
In this paper, the authors present a novel method to
minimize this loss of accuracy in Naïve Bayes Classifier due
to the assumption of Independence among predictors [2][3].
The experimental results show that the proposed method
performed well and improved the accuracy when compared
to the traditional Naïve Bayes Classifier.
The next section, Section-II, discusses the Naïve Bayes
Classifier; the data set is explained in Section-III; the
implementation is presented in Section-IV; Section-V
discusses the results; and the last section presents the
conclusions.