Research Article

A Hybrid Algorithm for Clustering of Time Series Data Based on Affinity Search Technique

Saeed Aghabozorgi, Teh Ying Wah, Tutut Herawan, Hamid A. Jalab, Mohammad Amin Shaygan, and Alireza Jalali

Faculty of Computer Science & Information Technology Building, University of Malaya, 50603 Kuala Lumpur, Malaysia

Correspondence should be addressed to Saeed Aghabozorgi; saeed@um.edu.my

Received 4 October 2013; Accepted 2 February 2014; Published 25 March 2014

Academic Editors: H. Chen, P. Ji, and Y. Zeng

Copyright © 2014 Saeed Aghabozorgi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Time series clustering is an important solution to various problems in numerous fields of research, including business, medical science, and finance. However, conventional clustering algorithms are not practical for time series data because they are essentially designed for static data. This impracticality results in poor clustering accuracy in several systems. In this paper, a new hybrid clustering algorithm is proposed based on the similarity in shape of time series data. Time series data are first grouped as subclusters based on similarity in time. The subclusters are then merged using the k-Medoids algorithm based on similarity in shape. This model makes two contributions: (1) it is more accurate than other conventional and hybrid approaches and (2) it determines the similarity in shape among time series data with a low complexity. To evaluate the accuracy of the proposed model, the model is tested extensively using synthetic and real-world time series datasets.

1. Introduction

Clustering is considered the most important unsupervised learning problem. The clustering of time series data is particularly advantageous in exploratory data analysis and summary generation.
Time series clustering also serves as a preprocessing step, either for another time series mining task or as part of a complex system. Researchers have shown that well-known conventional algorithms for the clustering of static data, such as partitional and hierarchical clustering, generate clusters with an acceptable structural quality and consistency and are partially efficient in terms of execution time and accuracy [1]. However, classic machine learning and data mining algorithms are ineffective on time series data because of the unique structure of time series, that is, their high dimensionality, very high feature correlation, and (typically) large amount of noise [2–4]. Accordingly, numerous research efforts have sought an efficient approach to time series clustering. However, the focus on the efficiency and scalability of these methods in handling time series data has come at the expense of the usability and effectiveness of clustering [5].

The clustering of time series data can be broadly classified into conventional approaches and hybrid approaches. Conventional approaches employed in the clustering of time series data are typically partitioning, hierarchical, or model-based algorithms. In hierarchical clustering, a nested hierarchy of similar objects is constructed based on a pairwise distance matrix [6]. Hierarchical clustering has great visualization power in time series clustering [7]. This characteristic has made hierarchical clustering very suitable for time series clustering [8, 9]. Additionally, in contrast to most algorithms, hierarchical clustering does not require the number of clusters as an initial parameter. This well-known feature is a strength in time series clustering because defining the number of clusters is often difficult in real-world problems.
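To make the role of the pairwise distance matrix concrete, the following minimal sketch builds a nested hierarchy from a handful of toy series. It is an illustration only, not the method of this paper: the Euclidean distance, the single-linkage merge rule, the toy data, and the function names `pairwise_distances` and `agglomerate` are all assumptions chosen for brevity.

```python
import math


def pairwise_distances(series):
    """Return the full n x n Euclidean distance matrix (assumed measure)
    for a list of equal-length time series."""
    n = len(series)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(series[i], series[j])
            dist[i][j] = dist[j][i] = d
    return dist


def agglomerate(dist):
    """Single-linkage agglomerative clustering (assumed linkage rule).

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a sorted tuple of original series indices.
    No target number of clusters is supplied.
    """
    clusters = [(i,) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = tuple(sorted(clusters[a] + clusters[b]))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


# Toy data (hypothetical): two rising and two falling series.
series = [
    [0.0, 1.0, 2.0, 3.0],
    [0.1, 1.1, 2.1, 3.1],
    [3.0, 2.0, 1.0, 0.0],
    [3.2, 2.1, 0.9, 0.1],
]
dist = pairwise_distances(series)
merges = agglomerate(dist)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

The merge history is the nested hierarchy: cutting it at any distance threshold yields a flat clustering, so the number of clusters can be chosen after the fact. The matrix holds one entry per pair of series, so its size grows quadratically with the number of series.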
However, hierarchical clustering is cumbersome when handling large time series datasets [10] because of its quadratic computational complexity. As a result of its poor scalability, hierarchical clustering is restricted to small datasets. On the other hand, partitioning algorithms, such as the well-known k-Means [11] or k-Medoids algorithm [12], are among the most used algorithms in this domain.

Hindawi Publishing Corporation, The Scientific World Journal, Volume 2014, Article ID 562194, 12 pages, http://dx.doi.org/10.1155/2014/562194