International Journal of Recent Technology and Engineering (IJRTE)
ISSN: 2277-3878, Volume-8, Issue-1, May 2019

Published By: Blue Eyes Intelligence Engineering & Sciences Publication
Retrieval Number A1851058119/19©BEIESP

An Experimental Comparison of Hybrid Modified Genetic Algorithm-based Prediction Models

Allemar Jhone P. Delima

Revised Manuscript Received on May 22, 2019.
Allemar Jhone P. Delima, College of Engineering and Information Technology, Surigao State College of Technology, Surigao City, Philippines.

Abstract: The quest for an optimal prediction model remains an active topic in data mining and machine learning. A model is considered optimal when the algorithm used possesses the highest performance rating on the evaluation metrics the researchers sought to satisfy. In this study, hybrid modified genetic algorithm-based prediction models were built with selected data mining algorithms, namely K-Nearest Neighbor (KNN), Naïve Bayes (NB), C4.5, and Rule Base (RB) algorithms such as DT, JRip, OneR, and PART. The crossover operator of the genetic algorithm was also modified to optimize the minimization of the variables before prediction. The simulation results showed that MGA-KNN outperformed the other models, achieving a prediction accuracy of 94%, compared with 86% for MGA-NB, 89% for MGA-C4.5, and 85%, 92%, 75%, and 92% for MGA-RB with the DT, JRip, OneR, and PART algorithms, respectively.

Index Terms: Hybrid prediction model, Modified genetic algorithm, IBAX operator, Prediction accuracy enhancement

I. INTRODUCTION

Data Mining (DM), otherwise known as Knowledge Discovery in Databases (KDD), is one of the fastest-developing fields owing to the tremendous need to extract added value from large-scale databases. It is defined as the extraction of information from large databases to discover essential and valuable data [1]. According to [2], the objective of data mining is to create either a descriptive model or a predictive model. A descriptive model presents the data in a succinct form, essentially a summary of the data points, discovers patterns in the data, and links the connections between the attributes the data represent. Tasks under the descriptive model include association rules, clustering, summarization, and sequence discovery. Meanwhile, the predictive model predicts future values of the data using known results found in previous datasets. Predictive data mining includes classification, prediction, regression, and time series analysis.

Prediction [3] is one of the renowned data mining approaches and is commonly used in educational data mining (EDM) [4]-[6], crime mining [7], [8], business and finance [9], [10], health [11], [12], and more.

The literature on forecasting and prediction is extensive. Various models have been developed and used in response to the problems researchers sought to answer. According to [3], there are two general categories of forecasting and prediction methods, namely classical and modern. Classical methods include econometrics-based approaches, statistical inference, and traditional mathematical programming, while modern methods employ soft computing algorithms and artificial intelligence.

For example, the study of [13] proposed a forecasting approach that combines the strengths of neural network and multivariate time series models. In that approach, the exchange rates of the UK, USA, and Japan were first forecast by time series models, and then a general regression neural network (GRNN) was used to correct the forecasting errors. On the other hand, [14] examined the accuracy of exchange rate forecasting in Brazil using different approaches. They compared intelligent systems, namely multilayer perceptron and radial basis function neural networks and the Takagi–Sugeno fuzzy system, against traditional forecasting methods such as autoregressive moving average (ARMA) and ARMA-generalized autoregressive conditional heteroscedasticity (ARMA-GARCH) linear models. It was found that the intelligent-based methods provided more accurate results than the traditional ones.

Recently, [15] developed a prediction model for OTOP products using K-Nearest Neighbor (KNN) with 5-, 10-, 15-, 20-, and 25-fold cross-validation and k values of 3, 5, and 7. The model with 5-fold cross-validation and k=3 yielded the best prediction, with 87.73% accuracy. Moreover, the authors suggested comparing the reliability of the results against Naïve Bayes, C4.5, and Rule Base algorithms in order to search for the optimal prediction model; hence this study.

The proposed prediction models incorporated the modified genetic algorithm (MGA), with its new crossover mating scheme, into the NB, C4.5, and RB algorithms. The MGA-KNN, MGA-NB, MGA-C4.5, and MGA-RB prediction models were then compared and evaluated to search for the optimal model for prediction. For real encoding problems using the arithmetic function, the average crossover (AX) [16] of the genetic algorithm was modified in this study.
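For orientation, the unmodified average crossover (AX) referred to in the introduction can be illustrated with a minimal Python sketch. It assumes the common formulation in which two real-encoded parents produce a single offspring whose genes are the arithmetic mean of the parents' genes. The function name `average_crossover` and the parent values are hypothetical and chosen only for illustration; the modified operator proposed in this study is not reproduced here.

```python
from typing import List, Sequence


def average_crossover(parent_a: Sequence[float], parent_b: Sequence[float]) -> List[float]:
    """Return one offspring whose genes are the mean of the parents' genes."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same number of genes")
    return [(a + b) / 2.0 for a, b in zip(parent_a, parent_b)]


# Hypothetical real-encoded chromosomes, used only for illustration.
parent_a = [1.0, 4.0, 0.5]
parent_b = [3.0, 2.0, 0.25]
child = average_crossover(parent_a, parent_b)
print(child)
```

Because each pair of parents yields only one offspring that always lies midway between them, plain averaging tends to pull the population toward its centre, which is one reason such operators are often revisited.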
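The evaluation protocol reported for [15], KNN assessed over several fold counts and k values, can likewise be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the setup used in [15] or in this study: the data are synthetic two-class points generated here, Euclidean distance and majority voting are assumed, and the names `knn_predict` and `cross_val_accuracy` are hypothetical.

```python
import math
import random
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]


def knn_predict(train: List[Sample], x: Sequence[float], k: int) -> int:
    """Classify x by majority vote among its k nearest training samples."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(data: List[Sample], n_folds: int, k: int, seed: int = 0) -> float:
    """Fraction of samples classified correctly when each fold is held out once."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    correct = 0
    for i, held_out in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        correct += sum(knn_predict(train, x, k) == y for x, y in held_out)
    return correct / len(data)


# Synthetic data (an assumption): two Gaussian clusters of 50 points each.
rng = random.Random(42)
data = [([rng.gauss(c, 1.0), rng.gauss(c, 1.0)], label)
        for label, c in ((0, 0.0), (1, 5.0)) for _ in range(50)]

# Same grid of fold counts and k values as described for [15].
results = {(n_folds, k): cross_val_accuracy(data, n_folds, k)
           for n_folds in (5, 10, 15, 20, 25) for k in (3, 5, 7)}
best = max(results, key=results.get)
print(best, results[best])
```

Odd values of k avoid tied votes in a two-class problem; with more classes a tie-breaking rule would need to be chosen.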