International Journal of Computer Applications (0975-8887), Volume 181, No. 42, February 2019

Application of Data Mining Tools for Identifying Determinant Factors for Crop Productivity

Assefa Chekole
Department of Information Science, University of Gondar, Ethiopia

Tibebe Beshah, PhD
School of Information Science, Addis Ababa University, Ethiopia

ABSTRACT

Agriculture is the backbone of the Ethiopian economy and contributes the largest share of the country's GDP. Within agriculture, crop production is the main source of income for most smallholder farmers in all regions of Ethiopia. The objective of this research is to build a model that can predict crop productivity and to implement a decision support system based on it. To conduct this research, a hybrid Knowledge Discovery Process model was adopted. The datasets were taken from the Central Statistical Agency of Ethiopia database, and the researcher used a total of 25,000 instances for training and building the model. The WEKA data mining tool was used to build the model, and the Java NetBeans IDE was used to implement the decision support system for predicting crop productivity. To achieve the objective of this research, different experiments were conducted using the J48 and HoeffdingTree decision tree classifiers and the PART rule-based classifier. The predictive performance of the classifiers was evaluated and compared using accuracy rate, confusion matrix and ROC curve. Of the three classifiers, the PART rule-based classifier achieved the best accuracy and ROC value, 95.44% and 0.992 respectively. As a result, the PART rule-based classifier was selected for implementing the model to predict crop productivity. The experimental results show that the main determinant factors for crop productivity are the main season (season type), use of an extension program, whether fertilizer is used, and fertilizer type.
Therefore, the outcome of this research can help policy makers and experts in crop agriculture make data mining based decisions, pay attention to the factors affecting crop productivity, and take corrective measures.

Keywords

Data mining, predictive model, decision support system, crop production, Ethiopia

1. INTRODUCTION

Data Mining (DM) is the process of analyzing data from different perspectives and summarizing it into useful information [1, 2]. Different DM algorithms exist, including predictive data mining algorithms, which produce classifiers that can be used for prediction and classification, and descriptive data mining algorithms, which serve other purposes such as finding associations and clusters [1, 2]. Data mining has recently gained much attention in application fields such as industry, economics, medicine, CRM and trade, owing to the existence of large collections of data in different formats and the increasing need for data analysis and comprehension [1, 2]. Data mining is a key tool for discovering knowledge in large databases: it is a process of semi-automatically analyzing large databases to find valid, novel, useful and understandable patterns [1, 2]. In addition, data mining gives as much attention to preprocessing and cleaning data as to modeling in order to obtain the best results [3, 4].

Agriculture is the backbone of the Ethiopian economy. In Ethiopia, crops are cultivated in two cropping seasons, belg and meher. According to the researcher's preliminary discussions with experts, crop productivity is currently predicted from farmers' past experience and field observations, and production output is estimated statistically from field observations made before and after harvest.
In addition, statistical prediction of crop production is not sufficient to identify the determinant factors for crop productivity. Crop productivity prediction is essential for identifying the causes of low or high productivity and for enhancing the productivity and production of smallholder farmers, mainly by reducing reliance on traditional ways of estimating productivity. As a result, it can strengthen the implementation of effective cropping strategies for national development programs and support data mining based decision making by decision makers and experts.

Crop agriculture in Ethiopia continues to be dominated by the country’s numerous smallholder farms, which cultivate mainly cereal crops for both own consumption and sale [5]. The major cereal crops harvested by smallholder farmers are teff, wheat, maize, sorghum, and barley.

A decision support system (DSS) is an interactive computer-based system intended to help decision makers use data and models to identify problems, solve problems and make decisions [6]. Such systems incorporate both data and models and are designed to assist decision makers in semi-structured and unstructured decision making processes. They support decision making; they do not replace it. The goal of decision support systems is to improve the effectiveness, rather than the efficiency, of decisions [6].

The use of data mining to facilitate decision support can improve the performance of decision making and can enable the tackling of new types of problems that have not been addressed before [7]. The integration of data mining and decision support can significantly improve current approaches and create new approaches to problem solving by enabling the fusion of knowledge from experts and knowledge extracted from data [7].
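To make the idea of combining mined knowledge with a decision support component more concrete, the following is a minimal Java sketch of an ordered rule list (a decision list, the general form of model that rule-based classifiers produce) wrapped in a small helper that returns both a prediction and the rule that produced it. Everything in it is an illustrative assumption: the class name `CropDecisionListSketch`, the attribute keys (`season`, `fertilizerUsed`, `extensionProgram`), the class labels and the rules themselves are hypothetical and are not the rules learned from the CSA data.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch of a decision list embedded in a decision support helper.
// Attribute names, labels, and rules are assumptions made for illustration only.
final class CropDecisionListSketch {

    /** One rule: a readable description, a condition on a record, and a predicted label. */
    static final class Rule {
        final String description;
        final Predicate<Map<String, String>> condition;
        final String label;

        Rule(String description, Predicate<Map<String, String>> condition, String label) {
            this.description = description;
            this.condition = condition;
            this.label = label;
        }
    }

    // Rules are evaluated in order; the first rule whose condition holds decides the class.
    static final List<Rule> RULES = List.of(
        new Rule("meher season, fertilizer used, extension program used",
            r -> "meher".equals(r.get("season"))
                && "yes".equals(r.get("fertilizerUsed"))
                && "yes".equals(r.get("extensionProgram")),
            "high"),
        new Rule("no fertilizer used",
            r -> "no".equals(r.get("fertilizerUsed")),
            "low")
    );

    // Label returned when no rule matches (the default rule of the list).
    static final String DEFAULT_LABEL = "medium";

    /** Returns the first matching rule, or null if none matches. */
    static Rule firstMatch(Map<String, String> record) {
        for (Rule rule : RULES) {
            if (rule.condition.test(record)) {
                return rule;
            }
        }
        return null;
    }

    /** Predicted productivity class for one farm record. */
    static String predict(Map<String, String> record) {
        Rule rule = firstMatch(record);
        return rule == null ? DEFAULT_LABEL : rule.label;
    }

    /** Short explanation a decision maker could read alongside the prediction. */
    static String explain(Map<String, String> record) {
        Rule rule = firstMatch(record);
        return rule == null ? "no rule matched; default class" : rule.description;
    }

    public static void main(String[] args) {
        Map<String, String> farm = Map.of(
            "season", "meher",
            "fertilizerUsed", "yes",
            "extensionProgram", "no");
        System.out.println(predict(farm) + " (" + explain(farm) + ")");
    }
}
```

Returning the matched rule's description together with the label reflects the point made above that a DSS supports, rather than replaces, the decision maker: an expert can inspect which conditions led to a prediction and compare them with their own knowledge.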
To conduct this research, the researcher used crop production sample survey datasets acquired from the Central Statistical Agency (CSA) of Ethiopia. For that purpose, a predictive data mining approach was employed. This study addressed the following specific objectives: