Application of a Hybrid Classifier to the Recognition of Petrochemical Odors

E. M. J. Oliveira, P. G. Campos, T. B. Ludermir, F. A. T. de Carvalho, W. R. de Oliveira
Center of Informatics, Federal University of Pernambuco, CP. 7851, 50740-540, PE, Brazil
E-mail: {emjo, pgc, tbl, fatc}@cin.ufpe.br, wilson.rosa@gmail.com

Abstract

Many data mining algorithms are currently applied to a wide range of problems, such as pattern classification. However, when used in isolation, these algorithms usually perform worse than combined models. The Bagging and Boosting techniques combine models of the same kind competitively; in other words, the output is generally provided by the winning classifier. Alternatively, Stacking usually combines different algorithms, constituting a hybrid model. Nevertheless, Stacking has a high cost, because it must search for the best models to combine for a given problem. We therefore present a Hybrid Classifier (HC) that recognizes gases derived from petroleum at a lower cost and in a cooperative way.

1. Introduction

The literature reports applications of data mining methods to many problems, such as time series prediction, association, clustering and pattern classification [2][3][8][9][11][12]. In classification problems, combined models generally perform better than a single classifier [2], which justifies the practical importance of combining models. The Bagging and Boosting techniques [3][8][9] combine models of the same kind competitively: the response of the resulting classifier is obtained by voting or by averaging the outputs of the individual models. Another alternative is the Stacking technique [2][12], in which different models are combined. In this way, a hybrid model can be obtained.
However, rather than choosing the most voted output, this combined model determines which algorithm is the most adequate for the problem at hand. As a result, the cost of obtaining the combined model is high. Models built properly with Stacking tend to give better results than Bagging and Boosting, but at a high cost.

Aiming at a classifier that provides better performance at a lower development cost, we present a Hybrid Classifier (HC) to identify gases derived from petroleum. The petrochemical compounds are ethane, methane, butane, carbon monoxide (CO) and propane, which have the peculiarity that human beings cannot smell them and the disadvantage that they can be lethal.

This work is divided into six sections. Sections 2 and 3 describe the HC. Section 4 presents details of the data used in the experiments. Section 5 gives details of the experiments performed. Section 6 contains a summary and the conclusions of this paper.

2. Hybrid Classifier

As mentioned in the introduction, combined models normally give better results than an isolated model. In this context, three techniques stand out: Bagging, Boosting and Stacking [2][3][8][9][12]. Combining distinct models with Stacking generally leads to better results than the competitive combination of models of the same kind with Bagging or Boosting. However, building hybrid models with Stacking is much more expensive, since it tries to find out which of the models is the most adequate for solving a given problem. Therefore, in this section we present an algorithm for constructing a cooperative hybrid classifier that requires less computational effort while keeping a classification performance superior to that of the isolated models applied to the same problem.
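As a rough illustration of the competitive combination attributed above to Bagging and Boosting (voting, or averaging the outputs of the individual models), the following minimal Python sketch may help. It is not the HC algorithm and not the implementation used in this work; the function names, gas labels and scores are hypothetical and chosen only for the example.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[str]) -> str:
    """Return the label predicted by most base classifiers.

    Ties go to the label that was encountered first.
    """
    return Counter(predictions).most_common(1)[0][0]


def average_outputs(outputs: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-class scores produced by each base classifier."""
    n_models = len(outputs)
    return [sum(scores) / n_models for scores in zip(*outputs)]


# Hypothetical labels from three base classifiers for one gas sample.
labels = ["methane", "propane", "methane"]
print(majority_vote(labels))

# Hypothetical per-class scores (methane, propane) from the same three models.
scores = [[0.5, 0.5], [0.25, 0.75], [0.75, 0.25]]
print(average_outputs(scores))
```

Both functions only aggregate the outputs of models that have already been trained. Stacking would instead need an additional step that learns how to combine or select the base models, which is where its extra cost comes from.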
Seventh International Conference on Hybrid Intelligent Systems. 0-7695-2946-1/07 $25.00 © 2007 IEEE. DOI 10.1109/HIS.2007.21

As will be seen below, the principle used to obtain the HC is “divide and conquer”. The idea is to check whether the database in question contains at least one linearly separable class. In this