Fusion Engineering and Design 85 (2010) 423–424
Empirically derived basis functions for unsupervised classification of radial profile data
D.G. Pretty a,∗, J. Vega b, M.A. Ochando b, F.L. Tabarés b
a Plasma Physics Laboratory, Research School of Physics and Engineering, Australian National University, Canberra ACT 0200, Australia
b Asociación EURATOM/CIEMAT para Fusión, Avda Complutense 22, 28040 Madrid, Spain
article info
Article history:
Available online 16 February 2010
Keywords:
Profile classification
SVD
Support vector machine
abstract
We present an analysis of empirically derived basis vectors for feature detection in radial profile data. Our aim is to classify broad and peaked profiles using unsupervised techniques. Radial data often contain a continuum of profile shapes from broad to peaked; as such, clustering methods may be unreliable. Previously, ad hoc heuristic measures were used to classify profiles from raw data (without tomographic reconstruction), which required significant manual inspection of the data. Here, we apply a singular value decomposition (SVD) to a training data matrix consisting of a concatenation of multichannel bolometry time series data from 103 TJ-II plasma discharges with good representation of the range of profiles. The spatial basis vector (topo) with the second largest singular value has radial roots either side of the plasma centre, and can intuitively be interpreted as a peakedness perturbation. The inverted topo matrix can be used to process new data for automated profile classification. Finally, we show an application of this method using support vector machines to locate other signals related to the radiation profile.
© 2010 Elsevier B.V. All rights reserved.
1. Empirical spatial basis vectors
Distinct “bell” and “dome” shaped profiles are observed in TJ-II bolometry (CBOL) profile data. To allow rapid or real-time classification of the profiles, it is desirable to use raw data and to tomographically reconstruct data only for specific shots of interest. Previously, an ad hoc comparison of outer bolometry channels, $b$ = CBOL13/CBOL14, was used to detect changes in profile shape from raw data. A transition of $b$ from a constant low value (between 1 and 4) to higher values with increased fluctuation was found to correspond to a bell to dome transition. However, as the $b$ parameterisation is not suitable for quantitative or unsupervised use, we instead use basis vectors, empirically derived using a singular value decomposition (SVD) of a data matrix selected to describe the profile features we wish to classify.
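As an illustration of the earlier heuristic, the channel ratio can be computed sample by sample from two raw signals. The following is a minimal sketch only: the function name `outer_channel_ratio`, the `eps` guard against near-zero denominators, and the numbers are assumptions for illustration, not TJ-II data or the authors' code.

```python
import numpy as np

def outer_channel_ratio(cbol13, cbol14, eps=1e-12):
    """Return CBOL13/CBOL14 per time sample; NaN where |CBOL14| <= eps."""
    cbol13 = np.asarray(cbol13, dtype=float)
    cbol14 = np.asarray(cbol14, dtype=float)
    ratio = np.full(cbol13.shape, np.nan)
    valid = np.abs(cbol14) > eps          # avoid dividing by (near) zero
    ratio[valid] = cbol13[valid] / cbol14[valid]
    return ratio

# Synthetic values for illustration only (not measured data).
cbol13 = np.array([2.0, 2.1, 6.0, 9.0, 1.0])
cbol14 = np.array([1.0, 1.0, 1.0, 1.5, 0.0])
b = outer_channel_ratio(cbol13, cbol14)
```

A single ratio of two channels discards the information in the remaining channels, which is one reason a basis built from all channels is attractive.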
The SVD is defined as $S = U A V^{*}$ [1] where, in the context of this paper, the rows of $S$ are the separate bolometry timeseries channels, the columns of $U$ and $V$ contain the spatial (topo) and temporal (chrono) orthonormal singular vectors respectively, $V^{*}$ denotes the conjugate transpose of $V$, and the diagonal matrix $A$ contains the non-negative singular values. The SVD is closely related to principal component analysis (PCA); if the channel means are subtracted from $S$, the topos $U$ correspond to the principal components of $S S^{T}$.
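The decomposition and its relation to PCA can be checked numerically. The sketch below uses NumPy's `numpy.linalg.svd` on a random matrix standing in for the channel data; the matrix size and all variable names are assumptions for illustration. It verifies that $U A V^{*}$ reproduces $S$, that the topos and chronos are orthonormal, and that after subtracting the channel means the topos agree (up to sign) with the eigenvectors of $S S^{T}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_samples = 16, 500                  # assumed sizes
S = rng.normal(size=(n_channels, n_samples))     # synthetic stand-in data

# Thin SVD: U is (16, 16), a holds the singular values, Vh is V* (16, 500).
U, a, Vh = np.linalg.svd(S, full_matrices=False)

reconstruction_ok = np.allclose(S, U @ np.diag(a) @ Vh)
topos_orthonormal = np.allclose(U.T @ U, np.eye(n_channels))
chronos_orthonormal = np.allclose(Vh @ Vh.T, np.eye(n_channels))

# PCA relation: subtract each channel's mean, then compare with S S^T.
S_centred = S - S.mean(axis=1, keepdims=True)
Uc, ac, _ = np.linalg.svd(S_centred, full_matrices=False)
eigvals, eigvecs = np.linalg.eigh(S_centred @ S_centred.T)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]   # descending order

eigenvalues_match = np.allclose(eigvals, ac**2)
# Singular vectors are defined only up to sign, so compare |overlap| with 1.
overlaps = np.abs(np.sum(Uc * eigvecs, axis=0))
directions_match = np.allclose(overlaps, 1.0)
```

The eigenvalues of $S S^{T}$ equal the squared singular values, which is why the singular values order the topos by signal energy.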
∗ Corresponding author at: Plasma Physics Laboratory, Research School of Physical Sciences, Australian National University, Canberra, Australia. Tel.: +61 402 305 212.
E-mail address: david.pretty@anu.edu.au (D.G. Pretty).
Both SVD and PCA methods are widely used with fusion data, e.g. for interferometry profile inversions [2], tokamak q-profile control [3] and fluctuation mode analysis [4,5]. While these methods are generally used for feature extraction (dimensionality reduction) and noise removal, here our goal is profile classification with respect to a predetermined feature, namely the peakedness of the profile. As described below, when the desired feature is expressed as a singular vector we benefit from the quantification of classification uncertainty.
To generate the basis vectors, we construct an $N_c \times N_s$ data matrix $S$ by concatenating the $N_c = 16$ bolometry channels of 103 TJ-II discharges, where $N_s$ = number of shots × samples per shot, with pre- and post-shot noise removed. The 103 discharges selected represent a wide range of typical TJ-II plasmas in which bell and dome profiles have been observed.
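The construction of the training matrix can be sketched as a concatenation along the time axis. This is an illustrative sketch under stated assumptions: the helper names `trim_to_plasma` and `build_data_matrix` are hypothetical, the trimming window is given here as explicit sample indices because the text does not say how pre- and post-shot noise is identified, and the data are random numbers rather than bolometry signals.

```python
import numpy as np

def trim_to_plasma(shot, start, stop):
    """Keep samples start..stop-1 of a (channels, samples) array.
    The indices are assumed inputs; how they are chosen is not specified."""
    return shot[:, start:stop]

def build_data_matrix(shots):
    """Concatenate trimmed shots along the time axis into one matrix S."""
    n_channels = shots[0].shape[0]
    for shot in shots:
        if shot.shape[0] != n_channels:
            raise ValueError("all shots must have the same number of channels")
    return np.concatenate(shots, axis=1)

# Three synthetic "shots" with 16 channels and different lengths.
rng = np.random.default_rng(1)
raw = [rng.normal(size=(16, n)) for n in (120, 150, 90)]
windows = [(10, 110), (20, 140), (5, 85)]        # hypothetical plasma windows
trimmed = [trim_to_plasma(s, lo, hi) for s, (lo, hi) in zip(raw, windows)]
S = build_data_matrix(trimmed)                   # shape (16, 100 + 120 + 80)
```

Stacking shots along the time axis keeps the number of rows fixed at the channel count, so the resulting topos live in channel space and can be applied to any new shot with the same channels.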
The dominant topos (columns of $U$ with the largest corresponding singular values) are shown in Fig. 1; the largest topo (topo 0) gives the general radial profile, and topo 1 is the perturbation which defines a peaked (flattened) profile when the corresponding chrono vector is positive (negative). Tomographic reconstruction of timeseries generated from these two largest topos shows that this interpretation from the raw data corresponds to the observed bell and dome shaped profiles. The occurrence of topo 1 as a measure of peakedness is not guaranteed by the SVD but is a consequence of the training set $S$ being selected from shots in which the bell–dome modification is the profile perturbation with greatest signal energy. The confidence in topo 1 as a reliable direct parameterisation of the basic profile shape can be quantified as: $p_{n1} = a_1^2 c_1 / \sum_{i=1,2,\ldots,15} a_i^2 c_i$
doi:10.1016/j.fusengdes.2010.01.020