Dhinaharan Nagamalai et al. (Eds): SAI, NCO, SOFT, ICAITA, CDKP, CMC, SIGNAL - 2019, pp. 83-99, 2019. © CS & IT-CSCP 2019. DOI: 10.5121/csit.2019.90707

FACTORS AFFECTING CLASSIFICATION ALGORITHMS RECOMMENDATION: A SURVEY

Mariam Moustafa Reda¹, Dr Mohammad Nassef² and Dr Akram Salah³

¹,²,³Computer Science Department, Faculty of Computers and Information, Cairo University, Giza, Egypt

ABSTRACT

Many classification algorithms are available in data mining for solving the same kind of problem, yet there is little guidance for recommending the algorithm that gives the best results for the dataset at hand. To improve the chances of recommending the most appropriate classification algorithm for a dataset, this paper surveys the factors that data miners and researchers have considered in different studies when selecting classification algorithms that will yield the desired knowledge from the dataset at hand. The paper divides the factors affecting classification algorithm recommendation into business and technical factors. The proposed technical factors are measurable and can be exploited by recommendation software tools.

KEYWORDS

Classification, Algorithm selection, Factors, Meta-learning, Landmarking

1. INTRODUCTION

Business organizations store large amounts of raw data in their databases. With increasingly competitive markets and growing computing capabilities, businesses face both this massive volume of stored data and the need to identify patterns, correlations, and predictive information that business experts may miss. Data mining is the field that helps business experts make better decisions based on the patterns and relationships discovered in the available data. One key data mining task is classification, which addresses the problem of assigning each unit of analysis in a dataset to a target class in order to support more accurate predictions. There are different categories of classification algorithms.
However, any classification algorithm needs one or more fields to use as predictors and a target field to predict.

To stay on track in a data mining project, a standard methodology or a list of best practices has to be followed. Efforts have been made to establish a standard data mining methodology that guides the implementation of different data mining tasks [1]. The most popular methodologies followed by researchers are CRISP-DM (Cross-Industry Standard Process for Data Mining) and SEMMA (Sample, Explore, Modify, Model, and Assess). CRISP-DM was developed under the European Strategic Program on Research in Information Technology, while SEMMA was developed by SAS Institute. Both methodologies have well-defined phases for modelling the data with an algorithm and for evaluating the model once it has been created. In addition, the earliest methodology, KDD (Knowledge Discovery in Databases), was adopted for years by data scientists.

During modelling, several algorithms could be used to perform the same data mining task and still produce different results. For example, to address a classification problem, one may choose from many algorithms: neural nets, which have many variants and are considered black-box models; C5.0 and CHAID, which are decision tree algorithms; last but not