International Journal of Recent Technology and Engineering (IJRTE)
ISSN: 2277-3878, Volume-8, Issue-1, May 2019
Published By: Blue Eyes Intelligence Engineering & Sciences Publication
Retrieval Number: A1851058119/19©BEIESP
Abstract: The search for an optimal prediction model remains an
active topic in data mining and machine learning. A model is
considered optimal when the algorithm used attains the highest
performance on the evaluation metrics the researchers set out to
satisfy. In this study, hybrid prediction models based on a
modified genetic algorithm (MGA) were built with selected data
mining algorithms, namely K-Nearest Neighbor (KNN), Naive Bayes
(NB), C4.5, and Rule Base (RB) algorithms such as DT, JRip, OneR,
and PART. The crossover operator of the genetic algorithm was
also modified to optimize the minimization of the variables
before prediction. The simulation results showed that MGA-KNN
outperformed MGA-NB, MGA-C4.5, and MGA-RB with the DT, JRip,
OneR, and PART algorithms, with prediction accuracies of 94%,
86%, 89%, 85%, 92%, 75%, and 92%, respectively.
Index Terms: Hybrid prediction model, Modified genetic
algorithm, IBAX operator, Prediction accuracy enhancement
I. INTRODUCTION
Data Mining (DM), otherwise known as Knowledge
Discovery in Databases (KDD), is one of the fastest-developing
fields due to the tremendous need to derive added value
from large-scale databases. It is defined as the extraction of
information from large databases to discover essential
and valuable data [1]. According to [2], the objective of data
mining is to create either a descriptive model or a predictive
model. A descriptive model presents the data in a succinct
form, which is basically a summary of the data points,
discovers patterns in the data, and links the connections
between attributes represented by the data. Tasks under the
descriptive model include association rules, clustering,
summarization, and sequence discovery. Meanwhile, the
predictive model predicts future values of the data using
known results found in previous datasets. Predictive data
mining includes classification, prediction, regression, and
time series analysis. Prediction [3] is one of the well-known
data mining approaches and is commonly used in educational data
mining (EDM) [4]-[6], crime mining [7], [8], business and
finance [9], [10], health [11], [12], and more.
The literature on forecasting and prediction is extensive.
Various models have been developed and used in response to
the problems researchers sought to answer. According to
[3], there are two general categories of forecasting and
prediction methods, namely classical and modern.
Classical methods include econometrics-based approaches,
statistical inference, and traditional mathematical
programming, while modern methods employ soft
computing algorithms and artificial intelligence.
Revised Manuscript Received on May 22, 2019.
Allemar Jhone P. Delima, College of Engineering and Information
Technology, Surigao State College of Technology, Surigao City, Philippines.
For example, the study of [13] proposed a forecasting
approach that combines the strengths of neural network
and multivariate time series models. In the proposed
approach, the exchange rates of the UK, the USA, and
Japan were first forecast with time series models, and then
GRNN was used to correct the forecasting errors. On the other
hand, [14] examined the accuracy of exchange rate forecasting
in Brazil using different approaches. They compared intelligent
systems, namely multilayer perceptron and radial basis function
neural networks and the Takagi–Sugeno fuzzy system, against
traditional forecasting methods such as the autoregressive
moving average (ARMA) and ARMA-generalized
autoregressive conditional heteroscedasticity
(ARMA-GARCH) linear models. The intelligent-based methods
were found to provide more accurate results than the
traditional ones.
Recently, [15] developed a prediction model for OTOP’s
products using K-Nearest Neighbor (KNN), evaluated with 5-,
10-, 15-, 20-, and 25-fold cross-validation and with the KNN
k value set to 3, 5, and 7. The model with 5-fold
cross-validation and k=3 yielded the best
prediction, with 87.73% accuracy. Moreover, the authors
suggested comparing the reliability of the results using Naïve
Bayes, C4.5, and Rule Base algorithms in order to search for
the optimal prediction model; hence, this study. The
proposed prediction models incorporate the modified
genetic algorithm (MGA), with its new crossover mating
scheme, into the NB, C4.5, and RB algorithms. The
MGA-KNN, MGA-NB, MGA-C4.5, and MGA-RB
prediction models were compared and evaluated to search for
the optimal model for prediction.
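The evaluation setup reported for [15], KNN scored under several fold counts and several k values, can be illustrated with a minimal sketch. It assumes Euclidean distance, majority voting, and shuffled folds; the function names and the toy two-cluster data are hypothetical and are not taken from [15] or from this study.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Predict the majority label among the k nearest training points."""
    # Sort training points by Euclidean distance to the query.
    neighbors = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, k_neighbors, n_folds, seed=0):
    """Mean KNN accuracy over n_folds shuffled cross-validation folds."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    # Deal indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in indices if i not in held]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k_neighbors) == y[i]
            for i in held_out
        )
        scores.append(correct / len(held_out))
    return sum(scores) / n_folds


def make_toy_data(n_per_class=30, seed=1):
    """Two well-separated 2-D Gaussian clusters (hypothetical toy data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in (("a", 0.0), ("b", 5.0)):
        for _ in range(n_per_class):
            X.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    # Same grid of fold counts and k values as described for [15].
    for n_folds in (5, 10, 15, 20, 25):
        for k in (3, 5, 7):
            acc = cross_val_accuracy(X, y, k, n_folds)
            print(f"folds={n_folds} k={k} accuracy={acc:.3f}")
```

The grid in the main block mirrors the fold counts and k values named in the text; the accuracies it prints describe only the toy data and say nothing about the results in [15].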
For real-encoding problems using the arithmetic function, the
average crossover (AX) operator [16] of the genetic algorithm was
modified in this study.
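For orientation, the unmodified average crossover on real-encoded chromosomes is commonly described as producing a single child whose genes are the arithmetic mean of the two parents' genes. The sketch below assumes that common description; the function name is hypothetical, and it shows neither the exact formulation in [16] nor the modified operator proposed in this study.

```python
def average_crossover(parent_a, parent_b):
    """Return one child whose genes are the mean of the parents' genes."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same number of genes")
    # Gene-wise arithmetic mean of two real-encoded chromosomes.
    return [(a + b) / 2.0 for a, b in zip(parent_a, parent_b)]


if __name__ == "__main__":
    child = average_crossover([1.0, 4.0, 0.5], [3.0, 2.0, 1.5])
    print(child)  # prints [2.0, 3.0, 1.0]
```

Because two parents yield only one child, each gene of the child lies between the corresponding parent genes.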
An Experimental Comparison of Hybrid
Modified Genetic Algorithm-based Prediction
Models
Allemar Jhone P. Delima