Available online at www.sciencedirect.com
Sensors and Actuators B 129 (2008) 643–651
Smell similarity on the basis of gas sensor array measurements
K. Brudzewski a,∗, S. Osowski a,b, K. Wolinska a, J. Ulaczyk a
a Warsaw University of Technology, Warsaw, Poland
b Military University of Technology, Warsaw, Poland
Received 23 May 2007; received in revised form 3 September 2007; accepted 4 September 2007
Available online 21 September 2007
Abstract
This paper discusses the problem of assessing the similarity of smells on the basis of gas sensor signals acquired in an electronic nose
system. We compare measures of similarity based on a geometrical description, an information-theoretic approach and the statistical
Kolmogorov–Smirnov test. Our main task was to develop measures that are compatible with the human perception of smells and that provide
a wide margin between similar and dissimilar smells. The results on the recognition of similar and dissimilar smells, presented and discussed
in the paper, suggest that the Kolmogorov–Smirnov measure is the most compatible with the human perception of smells and provides the widest margin.
© 2007 Elsevier B.V. All rights reserved.
Keywords: Electronic nose; Smell similarity; Sensor array
1. Introduction
The question of smell similarity is an important research
subject in the computer recognition of aroma by an electronic
nose system [1–7]. It is a very practical problem with wide
applicability in the cosmetic and food industries, especially for
tracing the aging of products. In spite of its great usefulness,
no fully accepted measure of aroma similarity has been defined
yet. The most often used measure is based on the distance, in
the feature space, between the patterns of sensor signals
representing the aromas. Proximity of two objects means that
they are similar in the affinity sense. An important problem,
however, is the resolution. On a normalized similarity scale
(the range 0–1), with 1 denoting the highest possible similarity
(two identical smells), it is desirable for the measure to take
values across the whole range rather than being limited to a
small sub-range of it.
The aim of this paper is to study how similarity measures can
be determined and to compare them with human perception.
The results will show that human and algorithmic similarity
measures differ substantially in nature, but can be grouped in
a cohesive way. The similarity measures considered in the
paper can be separated into three main groups.
∗ Corresponding author. E-mail address: brudz@ch.pw.edu.pl (K. Brudzewski).
The first one is based on the geometrical approach. The problem
of the similarity of two smells, represented by sensor signals
organized in vector form, can be defined as the similarity between
the vectors of data. The geometrical similarity takes into account
the distances between the vectors of data samples belonging to
different groups of data. In the classical geometric approach,
we determine the Euclidean distances between the data samples
belonging to different groups for all possible combinations. The
mean of these distances is used in the definition of similarity. The
other popular geometric method, called the vector space approach
(VSA), relies on clustering the data vectors first. After clustering,
all vectors belonging to the same cluster are represented by their
prototype vector (the so-called centroid). The similarity between
two clusters may be measured by the cosine of the angle between
the centroids representing them or by the Euclidean distance
between the centroids. The problem of how to estimate the number
of clusters, groups, dimensions, etc. is a pervasive one in
multivariate analysis. If there are no a priori theoretical reasons,
such decisions tend to remain somewhat arbitrary. In cluster
analysis and multi-dimensional scaling, decisions based upon
visual inspection of results are common. The results presented
in the paper show the weaknesses and limitations of both
geometrical measures, especially their small margins (small ranges
of the similarity measure for similar and dissimilar smells).
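The two geometrical quantities described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names and the sensor readings are hypothetical, and each group of samples is treated as a single cluster, whereas the VSA described above would first run a clustering step. How the resulting distance is mapped onto the normalized 0–1 similarity scale is not specified in this section, so it is left out.

```python
import math
from itertools import product


def euclidean(u, v):
    """Euclidean distance between two sensor-signal vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mean_pairwise_distance(group_a, group_b):
    """Mean Euclidean distance over all cross-group pairs of samples."""
    dists = [euclidean(u, v) for u, v in product(group_a, group_b)]
    return sum(dists) / len(dists)


def centroid(group):
    """Prototype vector of a group: the component-wise mean of its samples."""
    n = len(group)
    return [sum(column) / n for column in zip(*group)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical readings from a three-sensor array (assumed values),
# two measurements per smell.
smell_a = [[0.90, 0.20, 0.40], [1.00, 0.25, 0.35]]
smell_b = [[0.30, 0.80, 0.50], [0.35, 0.75, 0.55]]

# Classical geometric approach: mean of all cross-group distances.
print(mean_pairwise_distance(smell_a, smell_b))

# VSA-style comparison of the two prototype vectors.
c_a, c_b = centroid(smell_a), centroid(smell_b)
print(cosine_similarity(c_a, c_b))
print(euclidean(c_a, c_b))
```

The mean pairwise distance uses every sample of both groups, while the centroid-based quantities compress each group to one vector before comparing, so the latter are cheaper to compute but ignore the spread of samples within a group.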
To counteract this disadvantage, we have considered
measures based on statistical principles, which take into account
the distributional similarity between groups of data samples. All sensor
0925-4005/$ – see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.snb.2007.09.050