IFAC PapersOnLine 50-1 (2017) 11005–11010
Available online at www.sciencedirect.com (ScienceDirect)
2405-8963 © 2017, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
Peer review under responsibility of International Federation of Automatic Control.
10.1016/j.ifacol.2017.08.2479

Keywords: Time series; Similarity measure; Euclidean distance; Discrete Wavelet Transform; Discrete Fourier Transform; Correlation Coefficient; Mahalanobis distance; Minkowski Distance; Dynamic Time Warping Distance

1. INTRODUCTION

In the last decade there has been intense research on developing and deploying Personal Health Care services for cardiovascular disease (CVD) management. However, several gaps remain to be tackled before an automated system can efficiently perform personalized CVD management. Within this context, designing intelligent algorithms that process data obtained under uncontrolled conditions, that are self-adapting (moving from population-based to patient-specific adaptations), and that are accurate is still a research challenge. To this end, several strategies may be followed. One of them consists of first identifying the personal cardiac signal with a CVD pathology (e.g. by identifying similarity between the personal signal and a reference signal), followed by the automatic classification (i.e. clustering) of the signal under analysis into a specific class of signals (usually disease related). As in many other application areas, the collected cardiac signals may be regarded as long time series, i.e. as the simplest representation of temporal data, expressing the changes of real values at time or space points that result from sampling at a fixed time interval (Koh et al (2005)).
Time series similarity measurement is a method of measuring the degree of similarity between two time series. When dealing with physiological data, signals are never equal, although they might be similar, and the degree of similarity may indicate whether or not they are representative of the same health condition. A similarity method that tolerates small signal variations while still capturing the relationship among time series would greatly increase the precision of analysis in time series databases, helping to improve accuracy in classification, prediction and clustering (Jiang et al (2009); Kalpakis et al (2001)). Several works have been published on similarity measuring methods. Application of similarity matching algorithms is commonly encountered in various multimedia,

Abstract: Searching for similarity between time series plays an important role when large amounts of information need to be clustered for integration into intelligent supported personal health care diagnosis systems. The performance of classification, clustering and disease prediction is influenced by the prior stage, in which the similarity between time series is measured. Physiologic signals vary even within the same patient, so this work analyses how much they can vary without affecting subsequent clustering accuracy. Commonly employed methods of measuring similarity between time series were tested on data segments longer than the typical cardiac cycle, envisaging their use in personalized health care cardiovascular diagnosis systems. Euclidean distance, Discrete Wavelet Transform, Discrete Fourier Transform, Correlation Coefficient, Mahalanobis distance, Minkowski Distance, and Dynamic Time Warping Distance were compared when 20 levels of small variations in amplitude scaling and shift, time scaling and shift, baseline variance and additive Gaussian noise were imposed on the tested time series.
Concentrating on the performance of the similarity methods in terms of their insensitivity to small data variations, the results demonstrate that the time-domain Correlation Coefficient is the most robust method, while the Discrete Wavelet Transform is the preferred one among the transform-based methods tested. Selection of a similarity method should also take into account implementation issues, namely the need for data reduction to avoid computational burden; in this case, transform-based methods should be chosen.

Comparison of different methods of measuring similarity in physiologic time series

A. Kianimajd *, M. G. Ruano **, P. Carvalho ***, J. Henriques ****, T. Rocha *****, S. Paredes ******, A. E. Ruano *******

* FCT, University of Algarve, Faro, and CISUC-University of Coimbra, Portugal (e-mail: adell.kiani@gmail.com).
** FCT, University of Algarve, Faro, and CISUC-University of Coimbra, Portugal (e-mail: mruano@ualg.pt).
*** CISUC-University of Coimbra, Portugal (e-mail: carvalho@dei.uc.pt).
**** CISUC-University of Coimbra, Portugal (e-mail: jh@dei.uc.pt).
***** CISUC-University of Coimbra, Portugal (e-mail: teresa@isec.pt).
****** CISUC-University of Coimbra, Portugal (e-mail: sparedes@isec.pt).
******* FCT, University of Algarve, Faro, and IDMEC, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal (e-mail: aruano@ualg.pt).
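As an illustration of the kind of comparison summarized in the abstract, the following minimal Python sketch computes three of the listed measures (Euclidean distance, Correlation Coefficient, and Dynamic Time Warping distance) between a reference series and a copy with a small amplitude scaling and shift. It is not the authors' implementation. The synthetic sine signal, the scaling factor 1.1, the offset 0.05, and all function names are assumptions made for illustration only, and the DTW routine is a plain unconstrained dynamic-programming version with an absolute-difference local cost.

```python
import numpy as np


def euclidean_distance(x, y):
    """Euclidean (L2) distance between two equal-length series."""
    return float(np.sqrt(np.sum((x - y) ** 2)))


def correlation_coefficient(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    return float(np.corrcoef(x, y)[0, 1])


def dtw_distance(x, y):
    """Unconstrained DTW distance with absolute-difference local cost."""
    n, m = len(x), len(y)
    # cost[i, j] holds the best cumulative cost aligning x[:i] with y[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i, j] = local + min(
                cost[i - 1, j],      # step in x only
                cost[i, j - 1],      # step in y only
                cost[i - 1, j - 1],  # step in both
            )
    return float(cost[n, m])


# Hypothetical test signal (not the paper's data): a 5 Hz sine over 1 s.
t = np.linspace(0.0, 1.0, 200)
reference = np.sin(2 * np.pi * 5 * t)

# Small amplitude scaling and shift; the values are arbitrary assumptions.
perturbed = 1.1 * reference + 0.05

results = {
    "euclidean": euclidean_distance(reference, perturbed),
    "correlation": correlation_coefficient(reference, perturbed),
    "dtw": dtw_distance(reference, perturbed),
}

for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

Because the Pearson correlation is unchanged by a positive rescaling and a constant offset, it stays at 1 for this particular perturbation while the two distance measures become nonzero. This covers only the amplitude case; time scaling, time shift, baseline variation and noise would need separate experiments.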