Please cite this article in press as: J. Tobin, et al., Untargeted analysis of chromatographic data for green and fermented rooibos: Problem
with size effect removal, J. Chromatogr. A (2017), https://doi.org/10.1016/j.chroma.2017.10.024
Full length article
Untargeted analysis of chromatographic data for green and fermented
rooibos: Problem with size effect removal
Jade Tobin (a,b), Jan Walach (c), Dalene de Beer (a,b), Paul J. Williams (b), Peter Filzmoser (c), Beata Walczak (d,∗)
(a) Plant Bioactives Group, Post-Harvest and Wine Technology Division, Agricultural Research Council (ARC), Infruitec-Nietvoorbij, Private Bag X5026, Stellenbosch, 7599, South Africa
(b) Department of Food Science, Stellenbosch University, Private Bag X1, Matieland, Stellenbosch, South Africa
(c) Institute of Statistics and Mathematical Methods in Economics, Vienna University of Technology, Vienna, Austria
(d) University of Silesia, Institute of Chemistry, Szkolna 9, 400-006, Katowice, Poland
Article info
Article history:
Received 14 July 2017
Received in revised form 2 October 2017
Accepted 8 October 2017
Available online xxx
Keywords:
Multivariate analysis of variance
Pre-processing
Target projection
Pairwise log-ratio
Biomarkers identification
Rooibos tea fermentation
Abstract
When analyzing chromatographic data, it is necessary to preprocess them properly before exploration and/or
supervised modeling. To make chromatographic signals comparable, it is crucial to remove the scaling
effect caused by differences in overall sample concentrations. One of the efficient methods of signal
scaling is Probabilistic Quotient Normalization (PQN) [1]. However, it can be applied only to data for
which the majority of features do not vary systematically among the studied classes of signals. When
studying the influence of the traditional “fermentation” (oxidation) process on the 56 individual peaks
detected in rooibos plant material, this assumption is not fulfilled. In this case, the
only possible solution is the analysis of pairwise log-ratios, which are not influenced by the scaling
constant. To identify significant features, i.e., peaks differentiating the studied classes of samples (green
and fermented rooibos plant material), we propose the application of rPLR (robust pairwise log-ratios), as
proposed by Walach et al. [2]. It allows for fast computation and identification of the significant features
in terms of the original variables (peaks), which is problematic when working with the unfolded pairwise
log-ratios. As demonstrated, it can be applied to designed data sets and, in the case of contaminated data,
it allows proper conclusions to be drawn.
© 2017 Elsevier B.V. All rights reserved.
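The two claims made in the abstract, that pairwise log-ratios do not depend on the scaling constant and that PQN presupposes a stable majority of features, can be illustrated with a minimal numerical sketch. Everything in it is an assumption made for illustration: the toy peak areas, the dilution factors, the function names `pairwise_log_ratios` and `pqn`, and the simplified form of PQN (median reference profile, no preliminary integral normalization). It is not the rPLR procedure of Walach et al. [2], only the plain log-ratio construction that rPLR builds on.

```python
import numpy as np

def pairwise_log_ratios(X):
    """Return log(x_i / x_j) for every peak pair i < j, one row per sample."""
    logX = np.log(np.asarray(X, dtype=float))
    i, j = np.triu_indices(logX.shape[1], k=1)
    return logX[:, i] - logX[:, j]

def pqn(X):
    """Simplified PQN (assumed variant): divide each sample by the median
    of its quotients to the median reference profile."""
    X = np.asarray(X, dtype=float)
    reference = np.median(X, axis=0)
    factors = np.median(X / reference, axis=1, keepdims=True)
    return X / factors

# Hypothetical peak areas: 3 "green" and 3 "fermented" samples, 6 peaks.
# Five of the six peaks are halved by the treatment; the last one is unchanged.
green = np.ones((3, 6))
fermented = np.tile([0.5, 0.5, 0.5, 0.5, 0.5, 1.0], (3, 1))
areas = np.vstack([green, fermented])

# 1) Size effect: multiply each sample by its own (hypothetical) dilution factor.
dilution = np.array([[0.5], [1.0], [2.0], [0.8], [1.5], [3.0]])
invariant = np.allclose(pairwise_log_ratios(areas),
                        pairwise_log_ratios(areas * dilution))

# 2) PQN applied where the majority of peaks differ between the classes.
normalized = pqn(areas)

print("log-ratios unchanged by scaling:", invariant)
print("green sample after PQN:    ", normalized[0])
print("fermented sample after PQN:", normalized[3])
```

In this toy case the sample-specific factor cancels in every ratio, so the log-ratios of the scaled and unscaled data coincide. The simplified PQN, in contrast, absorbs the class difference into the normalization factor: the five peaks that truly decrease end up equal in both classes, and the single unchanged peak appears to increase. The price of the log-ratio approach is dimensionality: 6 peaks give 15 pairs, and 56 peaks give 56 · 55 / 2 = 1540.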
∗ Corresponding author.
E-mail address: beata.walczak@us.edu.pl (B. Walczak).
1. Introduction
Rooibos herbal tea, made from the indigenous South African
fynbos plant Aspalathus linearis (Burm.f.) R.Dahlgren, has gained
tremendous popularity on the global market. It is mainly produced
in the “fermented” (oxidised) form, with only a small amount
of green (unoxidised) herbal tea produced. The health-promoting
properties of rooibos, e.g. antioxidant, anti-cancer, antidiabetic,
hepatoprotective and anti-inflammatory activities, to name a few
(reviewed by Joubert et al.; Joubert and de Beer) [3,4], are mainly
associated with its unique phenolic composition. During traditional
processing of rooibos herbal tea, the fermentation step is
essential for developing the sought-after flavour and red-brown
colour of the tea. However, oxidation of phenolic compounds also
occurs, with a large reduction in aspalathin content in particular
(Walters et al.) [5]. The phenolic oxidation reactions occurring during
fermentation are still poorly understood. Chemometric analysis of
chromatographic data from green and fermented rooibos plant
material can provide information about which compounds are
involved in oxidative reactions during fermentation. Prior to data
analysis, chromatographic fingerprints have to be preprocessed to
eliminate all undesired signal components, such as baseline and
noise, and properly aligned to the selected target. Additionally,
to make them comparable, it is necessary to normalize them (in
order to remove the ‘size effect’) and to transform the studied
features to stabilize the data variance. All these steps determine
whether changes in the concentration of the plant material
components during the fermentation process are identified correctly.
In our previous study [6], limited to 16 known standards only, it was
shown that the fermentation process has a statistically significant
influence on the extract composition. However, when working with the
standards, it was possible to estimate their concentrations in the
studied extracts. When working with the entire fingerprints (untargeted
analysis), we have to work with the peak areas instead of concentrations,