Artificial Intelligence in Medicine 59 (2013) 197–204
journal homepage: www.elsevier.com/locate/aiim
Data structure-guided development of electrocardiographic signal
characterization and classification
Adam Gacek∗
Institute of Medical Technology and Equipment ITAM, 118 Roosevelt Street, 41-800 Zabrze, Poland
Article info
Article history:
Received 15 January 2013
Received in revised form 25 September 2013
Accepted 27 September 2013
Keywords:
Clustering algorithms
Fuzzy clustering
Cluster classification
Electrocardiographic signal classification
Abstract
Objective: The study introduces and elaborates on a perspective of biomedical data analysis in which
data structure is revealed through fuzzy clustering. The key objective is to characterize the content of
the clusters through a number of descriptors based on the membership grades of the patterns they
contain and on the class membership of those patterns. Next, a cluster-based classifier is designed
whose structure is built on a collection of clusters. The classifier exploits the cluster descriptors and
aggregates them with the activation levels of the associated clusters formed in the feature space in
which QRS complexes are represented.
Methods and materials: The underlying methods involve fuzzy clustering and two ways of representing
QRS complexes: the Hermite expansion of signals and piecewise aggregate approximation (PAA). The
material consists of QRS segments from the MIT-BIH Arrhythmia Database.
Results: The key results demonstrate and quantify the effectiveness of QRS characterization with the use
of clustering realized in the space of coefficients of the Hermite series expansion and the PAA expansion.
In general, the accuracy of the discussed classification schemes increases with the number of
clusters; the difference is in the range of 30% (when moving from 10 to 60 clusters). The fuzzification
coefficient of the fuzzy C-Means clustering algorithm has a visible impact on the quality of the results,
with a difference of up to 40% in classification accuracy (when the coefficient varies between
1.1 and 2.5). The PAA representation space leads to slightly better results than the Hermite
representation of the signals; the difference is around 5%.
Conclusions: It was shown that granular representation of electrocardiographic signals is essential to
data analysis and classification by providing a means to reveal and characterize the data structure and by
providing prerequisites to construct pattern classifiers. The study also shows that fuzzy clusters deliver
important structural information about the data that could be further quantified by looking into the
content of clusters.
© 2013 Elsevier B.V. All rights reserved.
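As an illustration of the piecewise aggregate approximation named in the abstract, the following minimal sketch replaces a sampled segment by the mean of each of its equal-length frames. The function name `paa`, the restriction to segment lengths divisible by the number of frames, and the sample values are assumptions made for this example only; they are not taken from the study.

```python
def paa(signal, n_segments):
    """Piecewise aggregate approximation: the mean of each equal-length frame."""
    n = len(signal)
    if n_segments <= 0 or n % n_segments != 0:
        # Simplifying assumption of this sketch: frames must tile the signal exactly.
        raise ValueError("len(signal) must be a positive multiple of n_segments")
    width = n // n_segments
    return [sum(signal[i * width:(i + 1) * width]) / width
            for i in range(n_segments)]


# Hypothetical 8-sample segment (not real ECG data) reduced to 4 coefficients.
qrs = [0.0, 0.2, 1.0, 0.6, -0.4, -0.2, 0.0, 0.0]
coeffs = paa(qrs, 4)
print(coeffs)
```

Each coefficient summarizes one frame, so the number of frames sets the dimensionality of the feature space in which clustering would then take place.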
1. Introduction
In pattern classification problems, we encounter a large number of algorithms of unsupervised and
supervised learning [1]. Classifiers demonstrate significant geometric diversity, ranging from linear
mappings between a feature space and class assignment (linear classifiers) to highly nonlinear
transformations such as those realized by means of neural networks or support vector machines.
Predominantly, classifiers are constructed in a supervised mode, which means that there are sets of
labeled patterns guiding the construction of classification mappings. Several interesting developments
can be seen in electrocardiographic (ECG) signal description, analysis and classification where recent
technologies of pattern recognition and machine learning are involved, see [2–4].
∗ Tel.: +48 32 271 60 13.
E-mail address: adam.gacek@itam.zabrze.pl
An interesting alternative is to develop the structure of a classifier by taking into consideration
the geometry of the data in the feature space, as identified by clustering the patterns (viz. revealing
a structure in the feature space in the form of a collection of clusters). The advantage of this approach
is that clustering (being unsupervised by nature) considers all patterns in a global manner and looks
for some general structure. As a result, we obtain the overall geometry of the data and the associated
classes, which are less affected by individual patterns, especially those that are misclassified. In
this way, we reduce the problems typical of supervised learning, where the design of classifiers with
highly nonlinear characteristics becomes quite sensitive to possible outliers.
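The idea of building a classifier on top of a cluster structure can be sketched as follows. This is only an illustrative outline under stated assumptions, not the scheme developed in this study: it uses the standard fuzzy C-Means updates, describes each cluster by a membership-weighted class profile, and classifies a pattern by weighting those profiles with its cluster activations. All function names, the deterministic initialization, the aggregation rule, and the toy data are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def memberships(x, prototypes, m):
    """FCM membership grades of pattern x in every cluster; they sum to 1."""
    d = [sq_dist(x, v) for v in prototypes]
    if 0.0 in d:
        # x coincides with one or more prototypes: share full membership among them.
        hits = [1.0 if di == 0.0 else 0.0 for di in d]
        return [h / sum(hits) for h in hits]
    power = 1.0 / (m - 1.0)  # distances are squared, hence 1/(m-1), not 2/(m-1)
    inv = [(1.0 / di) ** power for di in d]
    total = sum(inv)
    return [w / total for w in inv]


def fuzzy_c_means(data, c, m=2.0, n_iter=50):
    """Standard FCM iterations; m is the fuzzification coefficient (m > 1)."""
    # Assumption of this sketch: start from evenly spaced patterns, not random ones.
    step = max(1, len(data) // c)
    prototypes = [list(p) for p in data[::step][:c]]
    dim = len(data[0])
    for _ in range(n_iter):
        u = [memberships(x, prototypes, m) for x in data]
        for i in range(c):
            w = [row[i] ** m for row in u]
            total = sum(w)
            prototypes[i] = [sum(wk * x[j] for wk, x in zip(w, data)) / total
                             for j in range(dim)]
    return prototypes, [memberships(x, prototypes, m) for x in data]


def cluster_class_profiles(u, labels, classes):
    """Per-cluster descriptor: share of total membership held by each class."""
    profiles = []
    for i in range(len(u[0])):
        total = sum(row[i] for row in u)
        profiles.append({cl: sum(row[i] for row, y in zip(u, labels) if y == cl) / total
                         for cl in classes})
    return profiles


def classify(x, prototypes, profiles, m=2.0):
    """Weight each cluster's class profile by the activation of that cluster."""
    act = memberships(x, prototypes, m)
    scores = {cl: sum(a * p[cl] for a, p in zip(act, profiles)) for cl in profiles[0]}
    return max(scores, key=scores.get)


# Toy 2-D patterns with made-up labels "N" and "V" (not ECG data).
data = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [0.1, 0.2],
        [2.0, 2.1], [2.1, 2.0], [2.2, 2.1], [2.1, 2.2]]
labels = ["N"] * 4 + ["V"] * 4
prototypes, u = fuzzy_c_means(data, c=2)
profiles = cluster_class_profiles(u, labels, ["N", "V"])
print(classify([0.1, 0.1], prototypes, profiles))
```

Because the labels enter only through the cluster profiles, a single mislabeled pattern shifts a profile slightly rather than reshaping a decision boundary, which is the robustness argument made above.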
0933-3657/$ – see front matter © 2013 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.artmed.2013.09.004