IFAC PapersOnLine 50-1 (2017) 11005–11010
2405-8963 © 2017, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
Peer review under responsibility of International Federation of Automatic Control.
10.1016/j.ifacol.2017.08.2479
Keywords: Time series; Similarity measure; Euclidean distance; Discrete Wavelet Transform; Discrete
Fourier Transform; Correlation Coefficient; Mahalanobis distance; Minkowski distance; Dynamic Time
Warping distance
1. INTRODUCTION
In the last decade there has been intense and significant
research on developing and deploying Personal Health Care
services for cardiovascular disease (CVD) management.
However, several gaps remain to be tackled before an
automated system can efficiently perform personalized CVD
management. Within this context, designing intelligent
algorithms that process data obtained under uncontrolled
conditions, self-adapt (moving from population-based to
patient-specific adaptations) and remain accurate is still a
research challenge. To achieve this, several strategies may be
followed, one of them consisting of a prior association
of the personal cardiac signal with a CVD pathology (e.g. by
identifying the similarity between the personal signal and a
reference signal) followed by the automatic classification (i.e.
clustering) of the signal under analysis into a specific class of
signals (usually disease related). As in many other
application areas, the collected cardiac signals may be regarded
as long time series, i.e. as the simplest representation of
temporal data, expressing the changes of real values at time or
space points due to sampling at a fixed interval (Koh et
al. (2005)).
Time series similarity measurement is a method of measuring
the degree of similarity between two time series. When
dealing with physiological data, signals are never equal
although they might be similar, and the degree of similarity
may indicate whether or not they are representative of the same
health condition. A similarity method that tolerates small
signal variations while still effectively capturing the
relationship among the time series would greatly
increase the precision of analysis in time series databases,
helping to improve accuracy in classification, prediction and
clustering (Jiang et al. (2009); Kalpakis et al. (2001)).
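As an illustration of what "tolerating small variations" can mean, the following minimal Python sketch compares two of the measures named in the keywords on a synthetic signal. The sine wave, the 10% amplitude scaling and the function names are assumptions made for illustration only; they are not the data or the implementation used in this work.

```python
import numpy as np

def euclidean_distance(x, y):
    """Euclidean (L2) distance between two equal-length series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sqrt(np.sum((x - y) ** 2)))

def correlation_coefficient(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    return float(np.corrcoef(x, y)[0, 1])

# Hypothetical reference signal: a 5 Hz sine sampled at 500 points over 1 s.
t = np.linspace(0.0, 1.0, 500, endpoint=False)
reference = np.sin(2.0 * np.pi * 5.0 * t)

# Hypothetical small variation: 10% amplitude scaling of the same signal.
scaled = 1.1 * reference

d = euclidean_distance(reference, scaled)
r = correlation_coefficient(reference, scaled)
print(f"Euclidean distance: {d:.3f}, correlation coefficient: {r:.3f}")
```

In this toy case the Euclidean distance grows with the scaling factor, whereas the correlation coefficient is unaffected by a positive amplitude scaling, which is one way a measure can be insensitive to a small variation.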
Several works have been published on similarity
measurement methods. Similarity matching
algorithms are commonly applied in various multimedia,
Abstract: Searching for similarity between time series plays an important role when large amounts of
information need to be clustered for integration into intelligent personal health care diagnosis support systems.
The performance of classification, clustering and disease prediction is influenced by the prior stage in which
similarity between time series is measured. Physiologic signals vary even within the same patient, so the
extent to which they may vary without affecting subsequent clustering accuracy is addressed here.
Commonly employed methods of measuring similarity between time series were tested on data
segments longer than the typical cardiac cycle, envisaging their integration in personalized health care
cardiovascular diagnosis systems. Euclidean distance, Discrete Wavelet Transform, Discrete Fourier
Transform, Correlation Coefficient, Mahalanobis distance, Minkowski Distance, and Dynamic Time
Warping Distance were compared when 20 levels of small variations in amplitude scaling and shift, time
scaling and shift, baseline variance and additive Gaussian noise were imposed on the tested time series.
Concentrating on the performance of the similarity methods in terms of their insensitivity to small data
variations, the results demonstrate that the time-domain Correlation Coefficient is the most robust method, while
the Discrete Wavelet Transform is the preferred one among the transform-based methods tested. Selection
of a similarity method should also take into account implementation issues, namely the need for
data reduction to avoid computational burden, in which case transform-based methods should be chosen.
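The data-reduction argument for transform-based methods can be sketched as follows. This is a minimal Python illustration that assumes one common scheme: comparing only the first k coefficients of an orthonormal Discrete Fourier Transform. The function name `dft_reduced_distance`, the synthetic signals and the choice k = 32 are hypothetical and do not reproduce the configuration used in this work.

```python
import numpy as np

def dft_reduced_distance(x, y, k):
    """Euclidean distance between the first k orthonormal DFT coefficients.

    With norm="ortho" the DFT preserves energy (Parseval), so the distance
    on a subset of coefficients never exceeds the time-domain distance.
    """
    X = np.fft.fft(np.asarray(x, dtype=float), norm="ortho")
    Y = np.fft.fft(np.asarray(y, dtype=float), norm="ortho")
    return float(np.linalg.norm(X[:k] - Y[:k]))

# Hypothetical signals: a 4-cycle sine and a copy with additive Gaussian noise.
rng = np.random.default_rng(0)
n = 1024
t = np.arange(n) / n
x = np.sin(2.0 * np.pi * 4.0 * t)
y = x + 0.05 * rng.standard_normal(n)

full = float(np.linalg.norm(x - y))          # distance on all n samples
reduced = dft_reduced_distance(x, y, k=32)   # distance on 32 coefficients
print(f"time-domain distance: {full:.3f}, reduced DFT distance: {reduced:.3f}")
```

Only k complex coefficients per series need to be stored and compared instead of n samples, at the cost of discarding the detail carried by the omitted coefficients.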
* FCT, University of Algarve, Faro, and CISUC-University of Coimbra, Portugal (e-mail: adell.kiani@gmail.com).
** FCT, University of Algarve, Faro, and CISUC-University of Coimbra, Portugal (e-mail: mruano@ualg.pt).
*** CISUC-University of Coimbra, Portugal (e-mail: carvalho@dei.uc.pt).
**** CISUC-University of Coimbra, Portugal (e-mail: jh@dei.uc.pt).
***** CISUC-University of Coimbra, Portugal (e-mail: teresa@isec.pt).
****** CISUC-University of Coimbra, Portugal (e-mail: sparedes@isec.pt)
******* FCT, University of Algarve, Faro, and IDMEC, Instituto Superior Técnico, Universidade de Lisboa, Lisboa,
Portugal (e-mail: aruano@ualg.pt)
A. Kianimajd*, M. G. Ruano**, P. Carvalho***,
J. Henriques****, T. Rocha*****, S. Paredes******, A. E. Ruano*******
Comparison of different methods of measuring similarity in physiologic time
series