Application of a Hybrid Classifier to the Recognition of Petrochemical Odors
E. M. J. Oliveira, P. G. Campos, T. B. Ludermir, F. A. T. de Carvalho, W. R. de Oliveira
Center of Informatics, Federal University of Pernambuco, CP. 7851, 50740-540, PE, Brazil
E-mail: {emjo, pgc, tbl, fatc}@cin.ufpe.br, wilson.rosa@gmail.com
Abstract
Many data mining algorithms are now applied to a
wide range of problems, such as pattern
classification. However, when these algorithms are
used separately for classification, they usually
perform worse than combined models. The
Bagging and Boosting techniques combine models of
the same kind in a competitive form; in other words,
the output is generally provided by the winning
classifier. Alternatively, Stacking usually combines
different algorithms, constituting a hybrid model.
Nevertheless, Stacking has a high cost, due to the
search for the best models to combine for a given
problem. Thus, we present a Hybrid Classifier (HC)
that recognizes gases derived from petroleum at a
lower cost and in a cooperative way.
1. Introduction
The literature reports several applications of data
mining methods to problems such as time series
prediction, association, clustering and pattern
classification [2][3][8][9][11][12]. In classification
problems, combined models generally perform better
than a single classifier [2], which justifies, in
practice, the importance of combining models.
The Bagging and Boosting techniques [3][8][9]
provide a competitive combination of models of the
same kind. The response of the resulting classifier is
obtained by voting or by averaging the outputs of the
individual models. Another alternative is the Stacking
technique [2][12], in which different models are
combined. In this way, a hybrid model can be
obtained. However, instead of choosing the most-voted
output, this combined model determines which
algorithm is the most adequate for the problem at
hand. Consequently, the combined model is costly to
obtain.
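The two combination rules mentioned above for Bagging and Boosting, voting and averaging of outputs, can be illustrated with a minimal sketch. It is not taken from the experiments in this paper: the function names, the labels and the scores below are hypothetical and serve only to show how the individual outputs are merged.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by most base models.

    If two labels are tied, the one that appears first in the list wins.
    """
    return Counter(predictions).most_common(1)[0][0]


def average_outputs(outputs):
    """Average per-class scores over all models.

    Returns the index of the class with the highest mean score,
    together with the list of mean scores.
    """
    n_models = len(outputs)
    n_classes = len(outputs[0])
    mean = [sum(o[c] for o in outputs) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: mean[c])
    return best, mean


# Hypothetical outputs of three base models for a single odor sample.
labels = ["methane", "propane", "methane"]       # hard decisions
scores = [[0.6, 0.4], [0.3, 0.7], [0.8, 0.2]]    # columns: methane, propane

print(majority_vote(labels))
print(average_outputs(scores)[0])
```

Voting needs only the hard decision of each model, whereas averaging requires numeric outputs that are comparable across models.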
Models built adequately with Stacking tend to
generate better results than Bagging and Boosting,
but at a high cost. Aiming to obtain a classifier that
provides better performance at a lower development
cost, we present a Hybrid Classifier (HC) to identify
gases derived from petroleum. The petrochemical
compounds are ethane, methane, butane, carbon
monoxide (CO) and propane, which have the
peculiarity that human beings cannot smell them and
the disadvantage that they can be lethal.
This work is divided into six sections. Sections 2
and 3 describe the HC. Section 4 presents details of the
data used in the experiments. Section 5 reports the
experiments performed. Section 6 contains a summary
and the conclusions of this paper.
2. Hybrid Classifier
As mentioned in the introduction, combined models
normally present better results than an isolated
model. In this context, three techniques stand out:
Bagging, Boosting and Stacking [2][3][8][9][12].
Combining distinct models with Stacking generally
leads to better results than the competitive
combination of models of the same kind with Bagging
or Boosting. However, the construction of hybrid
models with Stacking is much more expensive, since it
tries to find out which of the models is the most
adequate for the solution of a certain problem.
Therefore, we present in this section an algorithm for
the construction of a cooperative hybrid classifier that
requires less computational effort while keeping a
superior classification performance compared to the
isolated models applied to the same problem.
As will be seen below, the HC is obtained by following
the “divide-and-conquer” principle. The idea is to
evaluate whether the database in question contains at
least one linearly separable class. In this
Seventh International Conference on Hybrid Intelligent Systems
0-7695-2946-1/07 $25.00 © 2007 IEEE
DOI 10.1109/HIS.2007.21