Abstract: Data mining and the classification of objects are data analysis processes that rely on various machine learning techniques and are used today in many fields of research. This paper presents the concept of a hybrid classification model improved with expert knowledge. The hybrid model integrates several machine learning techniques (Information Gain, K-means, and Case-Based Reasoning) and the expert’s knowledge into a single algorithm. The expert’s knowledge is used to determine the importance of features. The paper presents the model algorithm and the results of a case study in which the emphasis was on achieving maximum classification accuracy without reducing the number of features.

Keywords: Case-based reasoning, classification, expert's knowledge, hybrid model.

I. INTRODUCTION

TODAY, when the Internet has become an inevitable part of daily life, we are faced with the problem of collecting useful information and extracting valuable knowledge from large amounts of data. Data mining and machine learning are the research fields that deal with such issues. The major data mining techniques are regression, clustering and classification. In our study, we dealt with the problem of classification. We wanted to create a hybrid model that can be used for data classification in various problem domains and that achieves high classification accuracy. In this paper, we present the concept of a classifier that can achieve excellent classification results on large datasets. The concept of the hybrid classification model shows how modern machine learning methods can be merged with an expert’s knowledge. Classification is the process of determining the class to which an object (instance) belongs based on its feature values. The task of a classification model is to correctly classify a new object or instance.
Today, various machine learning methods are used for data classification, such as Neural Networks [1], Support Vector Machines [2] and Naive Bayes [3]. In order to achieve better classification accuracy, our study focused on the development of the hybrid model concept. The hybrid classification model consists of several methods from the field of machine learning and provides a slightly different classification approach. This hybrid model merges three machine learning techniques: Information Gain (IG), K-means and Case-Based Reasoning (CBR).

Bruno Trstenjak is with the Department of Computer Engineering, University of Applied Sciences Cakovec, Croatia, B. J. Jelacica 22a, 40000 Cakovec, Croatia (phone: +385 40 396 990; fax: +385 40 396 980; e-mail: btrstenjak@mev.hr). Dzenana Donko is with the Department of Computer Science, Faculty of Electrical Engineering, Sarajevo, Bosnia and Herzegovina (e-mail: ddonko@etf.unsa.ba).

One of the objectives in developing the new hybrid model concept was that the model should use all the features in the classification process, without any reduction in their number. For this reason, the IG method was implemented in the hybrid model. IG is used for ranking the features based on data entropy and certain statistical criteria [4]. The IG method calculates the information value of each feature, defined as the amount of information that the feature's values provide about the class. By ranking the features, the IG method determines their importance in the classification process. The hybrid model uses the obtained rank values when calculating the similarities between instances.

For the clustering process, the K-means algorithm is used [5]. K-means is one of the simplest unsupervised learning algorithms used for solving clustering problems. Clustering is the process of dividing data into clusters, grouped on the basis of common properties.
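The role of IG described above, ranking features by how much information they provide about the class and then using those values when comparing instances, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes discrete feature values, uses the textbook definition of information gain (class entropy minus conditional entropy), and the function names, the toy dataset and the weighted-overlap similarity measure are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, feature_index):
    """IG = H(class) - H(class | feature) for one discrete feature."""
    total = len(labels)
    labels_by_value = {}
    for row, label in zip(rows, labels):
        labels_by_value.setdefault(row[feature_index], []).append(label)
    conditional = sum(len(subset) / total * entropy(subset)
                      for subset in labels_by_value.values())
    return entropy(labels) - conditional


def weighted_similarity(a, b, weights):
    """Assumed similarity measure: share of total feature weight on which
    the two instances agree (all features are kept, none are removed)."""
    total_weight = sum(weights)
    if total_weight == 0:
        return 0.0
    matched = sum(w for x, y, w in zip(a, b, weights) if x == y)
    return matched / total_weight


# Hypothetical toy dataset: two discrete features, binary class.
rows = [
    ("sunny", "hot"),
    ("sunny", "mild"),
    ("rainy", "hot"),
    ("rainy", "mild"),
]
labels = ["no", "no", "yes", "yes"]

# One IG value per feature; these act as importance weights.
weights = [information_gain(rows, labels, i) for i in range(len(rows[0]))]

if __name__ == "__main__":
    print("feature weights:", weights)
    print("similarity:", weighted_similarity(rows[0], rows[1], weights))
```

In this toy data the first feature determines the class completely and the second carries no information about it, so the first feature dominates the similarity between instances while the second is still present in every comparison. How the hybrid model actually combines the rank values into a similarity measure is not specified at this point in the text.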
The K-means algorithm is used to optimize the clustering of the data and to prepare it for the classification phase. The third method in the hybrid model is the classification method, for which the CBR method was used. CBR solves new problems on the basis of previous cases, so past experience is essential to the method. All collected experiences are recorded in the form of cases. CBR examines cases from the past and, on the basis of similarity, proposes a solution to the current situation [6].

The remainder of the paper is organized as follows. Section II gives an overview of work related to hybrid models and expert’s knowledge. Section III deals with the concept of the hybrid model and the implemented algorithm. Section IV presents the results of the case study, and Section V concludes the paper.

Case-Based Reasoning: A Hybrid Classification Model Improved with an Expert's Knowledge for High-Dimensional Problems. Bruno Trstenjak, Dzenana Donko. World Academy of Science, Engineering and Technology, International Journal of Computer and Information Engineering, Vol:10, No:6, 2016. International Scholarly and Scientific Research & Innovation 10(6) 2016, ISNI:0000000091950263. publications.waset.org/10004850/pdf

II. BACKGROUND

The basic idea behind the development of the hybrid model was to create a classification concept that merges experience and knowledge with machine learning methods, to develop an algorithm that preserves all the characteristics of objects in the classification process, and to achieve high classification accuracy without using any