Class modeling techniques in the control of the geographical origin of wines
Michele Forina a,⁎, Paolo Oliveri a, Henry Jäger b, Ute Römisch b, Johanna Smeyers-Verbeke c
a Dipartimento di Chimica e Tecnologie Farmaceutiche ed Alimentari, Via Brigata Salerno 13, I-16147, Genova, Italy
b Technische Universität Berlin, Fak. III, Gustav-Meyer-Allee 25, D-13355 Berlin, Germany
c Vrije Universiteit Brussel, Farmaceutisch Instituut, Laarbeeklaan 103, B-1090 Brussels, Belgium
Article info
Article history: Received 18 June 2009; received in revised form 17 August 2009; accepted 21 August 2009; available online 29 August 2009
Keywords: Class modeling; SIMCA; UNEQ; MRM; Wine; Specificity; Sensitivity
Abstract
Wine samples from four countries (Hungary, Czech Republic, Romania and South Africa) were studied
within the European project WINE-DB, “Establishing of a wine data bank for analytical parameters for wines
from third countries”. For each country, two types of wine samples were collected during three consecutive
years: commercial wines and wines obtained by microvinification according to EC regulation N. 2729/2000.
The sampling design was organized to represent both the grape varieties and the official wine regions in the
four countries. The 1188 wine samples were analyzed for 58 chemical quantities.
Data analysis was performed with special attention to the real problem, namely the control of fraud.
Class modeling techniques (UNEQ, SIMCA, MRM) were applied to answer the general question: “Does
sample O, stated of class A, really belong to class A?”. Two validation strategies, one based on cross
validation and one on an external, representative evaluation set, were used to evaluate carefully the
predictive performance of the class models.
The results obtained with the four class modeling techniques indicate that, for the four countries, it is
possible to compute models with high efficiency, generally with a reduced number of variables. To obtain
efficient models, red and white wines, and commercial and microvinification wines, must be considered separately.
The validity of the models is ensured by the representativity of the samples, the appropriate application of
chemometric techniques, and the validation.
© 2009 Elsevier B.V. All rights reserved.
1. Introduction
This study was performed within the European project WINE-DB,
“Establishing of a wine data bank for analytical parameters for wines
from third countries”. The European Office for Wine, Alcohol and Spirit
Drinks aims to ensure correct implementation of EU wine quality
legislation and was set up to combat fraud in this area. To reach its
objectives, the Office created the European Wine Databank of authentic
European wines, to which analytical data (stable isotopes) from more
than a thousand authentic samples are added every year. Because
of the enlargement of the European Union and the need to extract
useful information from the analytical wine data, the project WINE-DB
was planned with three objectives: a) collecting analytical data from
countries not represented in the European Wine Databank; b) evaluating
the utility of analytical data (other than stable isotopes) in
the control of the geographical origin; c) comparing the composition of
authentic and commercial wines. The “authentic” wines are those
obtained by microvinification according to EC regulation N. 2729/2000.
The “commercial” wines studied in the project were collected by experts
of national wine control organizations, so both their geographical
origin and the grape variety used are certain. Moreover, the commercial
wines were of high quality, from well-known producers. The commercial
samples include all anthropogenic influences, so that they differ from the
authentic wines in the vinification procedure and in the storage time and conditions.
Control of the geographical origin means that a mathematical model,
built with the measured variables, must be used to verify the truthfulness of
the origin declared on the bottle. The chemometric techniques that build
such models are the class modeling techniques (CMT). They answer the
question: “Does sample O, stated of class A, really belong to class A?”. In
contrast, classification techniques assign objects to one of the
classes specified in the problem. The difference is not trivial because, in
practice, classification means assigning an origin to an unlabeled
sample, a situation that is rather rare. Nevertheless, classification techniques,
especially linear discriminant analysis (LDA), are very frequently used in studies
on wines and other typical foods, both because the related software is
easy to find and because the many available examples with informative
plots have made LDA very popular in food science. Class modeling
techniques, on the other hand, have rarely been used, and then with more
attention to their classification ability than to their modeling
performance. However, some recent papers [1–4] indicate increased
attention to the development and improvement of class modeling
techniques and to their use in food control problems.
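The distinction between class modeling and classification can be illustrated with a minimal sketch. It is not the implementation used in this study. It assumes only two variables, made-up measurements for two hypothetical classes A and B, and a simplified centroid-distance model loosely in the spirit of UNEQ: a sample is accepted by a class if its squared Mahalanobis distance from the class centroid does not exceed a chi-square critical value (for two degrees of freedom the quantile has the closed form -2 ln(1 - p)). The function names and all numbers are illustrative assumptions.

```python
import math

def fit_class_model(samples, confidence=0.95):
    """Fit a two-variable class model from (x, y) training samples."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    # Unbiased sample variances and covariance.
    sxx = sum((x - mean_x) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples) / (n - 1)
    det = sxx * syy - sxy ** 2
    # Inverse of the 2x2 covariance matrix [[a, b], [b, c]], stored as (a, b, c).
    inv_cov = (syy / det, -sxy / det, sxx / det)
    # Assumption: chi-square critical value with 2 degrees of freedom,
    # whose quantile at probability p is -2 * ln(1 - p).
    critical = -2.0 * math.log(1.0 - confidence)
    return {"centroid": (mean_x, mean_y), "inv_cov": inv_cov, "critical": critical}

def squared_distance(model, sample):
    """Squared Mahalanobis distance of a sample from the class centroid."""
    dx = sample[0] - model["centroid"][0]
    dy = sample[1] - model["centroid"][1]
    a, b, c = model["inv_cov"]
    return a * dx * dx + 2.0 * b * dx * dy + c * dy * dy

def accepts(model, sample):
    """Class modeling: does the sample fit the model of its declared class?"""
    return squared_distance(model, sample) <= model["critical"]

def nearest_class(models, sample):
    """Classification: always assign the sample to the closest class."""
    return min(models, key=lambda name: squared_distance(models[name], sample))

# Hypothetical training data (two measured variables per sample).
class_a = [(1.0, 2.0), (1.2, 2.1), (0.9, 1.8), (1.1, 2.3), (1.3, 1.9), (0.8, 2.2)]
class_b = [(3.0, 4.0), (3.2, 4.1), (2.9, 3.8), (3.1, 4.3), (3.3, 3.9), (2.8, 4.2)]
models = {"A": fit_class_model(class_a), "B": fit_class_model(class_b)}

genuine = (1.05, 2.0)  # hypothetical sample close to the class A centroid
suspect = (1.8, 2.9)   # hypothetical sample labeled "A" but far from both classes

print("genuine accepted by A:", accepts(models["A"], genuine))
print("suspect accepted by A:", accepts(models["A"], suspect))
print("suspect assigned by classification to:", nearest_class(models, suspect))
```

In this sketch the classification rule still assigns the suspect sample to class A, because A is the nearer of the two classes, whereas the class model rejects the declared origin. That is the behavior needed when the question is whether a label is true rather than which label to give.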
Chemometrics and Intelligent Laboratory Systems 99 (2009) 127–137
⁎ Corresponding author. Tel.: +39 010 3532630; fax: +39 010 3532684.
E-mail address: forina@dicfta.unige.it (M. Forina).
0169-7439/$ – see front matter © 2009 Elsevier B.V. All rights reserved.
doi:10.1016/j.chemolab.2009.08.002