J Intell Inf Syst (2014) 42:531–566
DOI 10.1007/s10844-013-0290-3

TS-stream: clustering time series on data streams

Cássio M. M. Pereira · Rodrigo F. de Mello

Received: 3 July 2013 / Revised: 13 November 2013 / Accepted: 15 November 2013 / Published online: 1 December 2013
© Springer Science+Business Media New York 2013

Abstract The current ability to produce massive amounts of data, together with the impossibility of storing all of it, has motivated the development of data stream mining strategies. Although many techniques have been proposed, this research area still lacks approaches to mine data streams composed of multiple time series, a setting with applications in finance, medicine and science. Most current techniques for clustering streaming time series have a serious limitation in their similarity measures, which are based on the Pearson correlation. In this paper, we show that the Pearson correlation is not capable of detecting similarities even for classic time series models, such as those by Box and Jenkins. This limitation motivated our proposal to cluster streaming time series based on their generating functions, which is achieved by considering features obtained using descriptive measures, such as Auto Mutual Information, the Hurst Exponent and several others. We present a new tree-based clustering algorithm, entitled TS-Stream, which uses the extracted features to produce partitions in better accordance with the time series generating functions. Experiments with synthetic data sets confirm that TS-Stream outperforms ODAC, currently the most popular technique, in terms of clustering quality. Using real financial time series from the NYSE and NASDAQ, we conducted stock trading simulations employing TS-Stream to support the creation of diversified investment portfolios. Results confirmed that TS-Stream increased the monetary returns by several orders of magnitude when compared to trading strategies based simply on the Moving Average Convergence Divergence financial indicator.
Keywords Data streams · Clustering · Time series · Decision trees

C. M. M. Pereira · R. F. de Mello
Institute of Mathematical and Computer Sciences–ICMC–USP, University of São Paulo, Av. Trabalhador São-carlense, São Carlos 400 13566-590, SP, Brazil
e-mail: cpereira@icmc.usp.br

R. F. de Mello
e-mail: mello@icmc.usp.br
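
The abstract's remark about the Pearson correlation can be illustrated with a minimal sketch. It simulates two independent realizations of the same first-order autoregressive model, AR(1), one of the classic Box and Jenkins models, and compares their cross-series Pearson correlation with a per-series descriptive feature. Everything here is an assumption made for illustration: the coefficient 0.8, the series length, the seeds, the helper names `ar1`, `pearson` and `lag1_autocorr`, and in particular the lag-1 autocorrelation, which is only a simple stand-in for the descriptive measures named in the abstract (Auto Mutual Information, the Hurst Exponent). It is not the TS-Stream algorithm.

```python
import math
import random


def ar1(phi, n, seed):
    """Simulate x[t] = phi * x[t-1] + e[t], with standard normal noise e[t]."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def lag1_autocorr(x):
    """Illustrative feature: correlation of a series with itself shifted by one step."""
    return pearson(x[:-1], x[1:])


# Two independent realizations of the same generating function (assumed phi = 0.8).
series_a = ar1(0.8, 5000, seed=1)
series_b = ar1(0.8, 5000, seed=2)

cross_corr = pearson(series_a, series_b)   # compares raw values across series
feature_a = lag1_autocorr(series_a)        # describes series_a on its own
feature_b = lag1_autocorr(series_b)        # describes series_b on its own

print("Pearson between series:", round(cross_corr, 3))
print("lag-1 autocorrelation:", round(feature_a, 3), round(feature_b, 3))
```

Because the two series are driven by independent noise, their Pearson correlation should stay close to zero even though they share a generating function, while the per-series feature should be close to the model coefficient for both. This is the intuition behind comparing series through extracted features rather than through raw values.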