L. Wang and Y. Jin (Eds.): FSKD 2005, LNAI 3613, pp. 1171 – 1174, 2005.
© Springer-Verlag Berlin Heidelberg 2005
Preventing Meaningless Stock Time Series Pattern Discovery by Changing Perceptually Important Point Detection
Tak-chung Fu¹,²,†, Fu-lai Chung¹, Robert Luk¹, and Chak-man Ng²
¹ Department of Computing, The Hong Kong Polytechnic University, Hong Kong.
{cstcfu, cskchung, csrluk}@comp.polyu.edu.hk
² Department of Computing and Information Management,
Hong Kong Institute of Vocational Education (Chai Wan), Hong Kong.
cmng@vtc.edu.hk
Abstract. Discovery of interesting or frequently appearing time series patterns
is one of the important tasks in various time series data mining applications.
However, recent research has argued that discovering subsequence patterns in
time series using clustering approaches is meaningless. This is due to the
presence of trivial match subsequences when the time series subsequences are
formed using the sliding window method. The objective of this paper is to propose
a threshold-free approach to improve the method for segmenting long stock
time series into subsequences using a sliding window. The proposed approach
filters the trivial match subsequences by changing Perceptually Important Point
(PIP) detection and reduces the dimension by PIP identification.
1 Introduction
When time series data are divided into subsequences, interesting patterns can be
discovered, and the data become easier to query, understand and mine. Therefore, the
discovery of frequently appearing time series patterns, also called surprising patterns
in [1], has become one of the important tasks in various time series data mining applications.
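As a minimal sketch of dividing a series into subsequences, assume a fixed-size sliding window that advances one point at a time. The Python code below extracts all such subsequences and measures how often the closest match of a subsequence, apart from itself, starts within two points of it. The function names, the window size of 20, the unnormalized Euclidean distance, and the toy random-walk series are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def sliding_windows(series, w):
    """Return every length-w subsequence, sliding one point at a time."""
    return [series[i:i + w] for i in range(len(series) - w + 1)]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbour_offsets(series, w):
    """For each subsequence, the offset (in points) to its closest other subsequence."""
    subs = sliding_windows(series, w)
    offsets = []
    for i, s in enumerate(subs):
        # Search all other subsequences for the best match (brute force).
        best_j = min((j for j in range(len(subs)) if j != i),
                     key=lambda j: euclidean(s, subs[j]))
        offsets.append(abs(best_j - i))
    return offsets


# Toy random-walk "price" series (illustrative data only, not real stock data).
random.seed(0)
prices = [100.0]
for _ in range(199):
    prices.append(prices[-1] + random.gauss(0, 1))

offsets = nearest_neighbour_offsets(prices, w=20)
share_adjacent = sum(o <= 2 for o in offsets) / len(offsets)
print(f"{share_adjacent:.0%} of subsequences have their best match within two points")
```

On slowly varying data, one would expect this share to be high, because overlapping windows share most of their points. The brute-force search is quadratic in the number of subsequences and is meant only for small illustrative inputs.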
For the problem of time series pattern discovery, a commonly employed technique
is clustering. However, applying clustering approaches to discover frequently
appearing patterns has recently been criticized as meaningless when applied to time
series subsequences [2]. This is because, when a sliding window with a fixed window
size is used to divide a long time series into subsequences, trivial match subsequences
always exist. The existence of such subsequences leads to the discovery of patterns
that are derivations of a sine curve. A subsequence is said to be a trivial match when it is
similar to an adjacent subsequence formed by the sliding window; the best matches to a
subsequence, apart from itself, tend to be the subsequences that begin just one or two
points to the left or the right of the original subsequence [3]. Therefore, it is necessary
to prevent the over-counting of these trivial matches. For example, in Fig. 1, the
shapes of S₁, S₂ and S₃ are similar to a head and shoulders (H&S) pattern while the
† Corresponding author.