International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 02 Issue: 03 | June-2015 www.irjet.net p-ISSN: 2395-0072
© 2015, IRJET.NET- All Rights Reserved Page 430
Efficient Periodicity Mining using Circular Autocorrelation in Time
Series Data
Y. B. Malode¹, D. B. Khadse², D. V. Jamthe³
¹ Asst. Professor, Information Technology Department, PBCOE, M.H., India
² Asst. Professor, Computer Science & Engineering Department, PBCOE, M.H., India
³ Asst. Professor, Computer Science & Engineering Department, PBCOE, M.H., India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract – This paper focuses on symbol, segment, and
partial periodicity mining. We propose an algorithm
that detects periodic patterns by extracting a set of
candidate periods from a time series using circular
autocorrelation. The proposed algorithms detect all
periodicities in a time series without any prior
knowledge of the nature of the data. Moreover, the
proposed algorithms discover the periodic patterns for
a conservative set of periods. Experimental results show
that the proposed algorithms are highly accurate with
respect to the discovered periodicity rates and periodic
patterns. Real-data experiments demonstrate the
practicality of the discovered periodic patterns.
Key Words: Time Series Database, Symbol Periodicity,
Segment or Full Cycle Periodicity, Partial periodicity.
1. INTRODUCTION
Periodicity mining in time series databases plays an
important role in data mining. It can be used as a tool for
forecasting and predicting the future behavior of a time
series. Researchers have proposed different algorithms for
periodicity detection in time series databases. A time series
database is a database that contains data recorded over time,
e.g., weather data that contains several measurements taken at
different times per day. Pattern mining is an approach to
detect different symbol patterns, each consisting of a
combination of symbols from the input symbol set (length of
pattern, L ≥ 1). The input symbol set is the set of symbols
that can be used to symbolize the entire time series.
Consider the set of transactions X = {15, 10, 25, 41, 13, 44,
57, 60}; the input symbol set ∑ = {a, b, c, d, e}; the total
number of symbols in ∑ is 5; and interval width =
(Xmax - Xmin) / total symbols = (60 - 10) / 5 = 10. Then X is
discretized into the symbolized time series T = {aabdadee},
where symbol a: limit 10-20, symbol b: limit 21-30, symbol c:
limit 31-40, symbol d: limit 41-50, symbol e: limit 51-60.
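The discretization step can be illustrated with a minimal Python sketch. The function name symbolize and the boundary handling are assumptions made for illustration, not the authors' implementation: the first interval is taken to include both of its limits (10-20), and each later interval excludes its lower boundary, which matches the limits listed above for integer data. Under these assumptions, the eight values of X map to eight symbols.

```python
import math

def symbolize(values, alphabet):
    """Map each value to a symbol using equal-width intervals (illustrative helper)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: all values are equal, so only one interval is needed.
        return alphabet[0] * len(values)
    width = (hi - lo) / len(alphabet)  # interval width = (Xmax - Xmin) / total symbols
    symbols = []
    for x in values:
        # A value on an upper boundary stays in the lower interval (20 -> a, 21 -> b).
        index = math.ceil((x - lo) / width) - 1
        index = max(0, min(index, len(alphabet) - 1))  # clamp the minimum value into 'a'
        symbols.append(alphabet[index])
    return "".join(symbols)

X = [15, 10, 25, 41, 13, 44, 57, 60]
T = symbolize(X, "abcde")
print(T)  # aabdadee
```

Symbol c never appears because no value of X falls in the 31-40 interval.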
Periodic patterns indicate the repetitive occurrence of
activities or events. The repetition count indicates the
periodicity of a pattern or symbol. The period is the interval
after which a pattern regularly recurs in the time series.
Periodicity mining is the analysis of time series data to
detect recurring patterns. Another aspect of periodicity
mining that needs more attention is symbolization. A time
series is usually symbolized before it is analyzed. The basic
idea behind symbolization is to shorten and speed up the
analysis. Analyzing a time series without symbolization is
tedious and time consuming because periodicity mining is
concerned with the analysis of large volumes of time series
data. In this paper, we focus on symbol, segment, and partial
periodicity mining, which characterize the behavior of a time
series.
Symbol Periodicity
A time series T is said to have symbol periodicity if any
symbol from the input symbol set ∑ recurs with period P in T
at most of the positions given by stPos + I × P, where
P = 1, ..., length(T) - 1; stPos + I × P ≤ length(T); I ≥ 0.
Consider the symbolized time series T = {abcbdbecbdbc}.
Here, symbol b repeats with regular interval 2; its starting
position (stPos) is 2 and its end position (endPos) is 11.
According to the definition, if P = 2, stPos = 2, and
length(T) = 12, the symbol should repeat at positions 2, 4, 6,
8, 10, 12, but it actually occurs at positions 2, 4, 6, 9, 11.
This example shows that a symbol or segment that occurs at
positions other than the expected ones, but retains the same
interval (period) between almost all of its actual positions,
still exhibits symbol or segment periodicity.
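The example can be checked with a short Python sketch. The helper names below are hypothetical and chosen only for illustration; positions are 1-based to match the text. The sketch lists the actual positions of a symbol, the positions expected from stPos + I × P, and how many consecutive gaps between actual positions equal the period.

```python
def symbol_positions(series, symbol):
    """Return the 1-based positions at which `symbol` occurs in `series`."""
    return [i + 1 for i, s in enumerate(series) if s == symbol]

def expected_positions(st_pos, period, length):
    """Return stPos + I * P for I >= 0 while the position stays within the series."""
    return list(range(st_pos, length + 1, period))

def matching_gaps(positions, period):
    """Count consecutive gaps equal to `period`; return (matching, total gaps)."""
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return sum(1 for g in gaps if g == period), len(gaps)

T = "abcbdbecbdbc"
actual = symbol_positions(T, "b")
print(actual)                            # [2, 4, 6, 9, 11]
print(expected_positions(2, 2, len(T)))  # [2, 4, 6, 8, 10, 12]
print(matching_gaps(actual, 2))          # (3, 4)
```

Three of the four gaps between occurrences of b equal the period 2, which is the sense in which the symbol keeps the same interval for almost all of its actual positions.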
Segment Periodicity
A time series T is said to have segment periodicity if any
segment, which can be any combination of symbols from the
input symbol set ∑, recurs with period P in the time series,
where P = 2, ..., length(T)/2.
Consider the symbolized time series
T = {abcabdabecedabb}
Here, segment ab recurs at positions 1, 4, 7, 13; stPos = 1,
endPos = 14, P = 3. The expected periodicity for segment ab
is 5 (positions 1, 4, 7, 10, 13), but the actual periodicity
is 4 because ab does not occur at position 10. This shows
imperfect segment periodicity.
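A minimal Python sketch of this check follows. The function names are hypothetical, positions are 1-based, and the sketch assumes that an expected position counts only if the whole segment still fits inside the series. It compares the number of expected start positions with the number at which the segment actually occurs.

```python
def segment_positions(series, segment):
    """Return the 1-based start positions of `segment` in `series` (overlaps allowed)."""
    m = len(segment)
    return [i + 1 for i in range(len(series) - m + 1) if series[i:i + m] == segment]

def segment_periodicity(series, segment, st_pos, period):
    """Return (actual, expected) occurrence counts for start positions stPos + I * P."""
    last_start = len(series) - len(segment) + 1  # last start where the segment fits
    expected = list(range(st_pos, last_start + 1, period))
    actual = set(segment_positions(series, segment))
    hits = [p for p in expected if p in actual]
    return len(hits), len(expected)

T = "abcabdabecedabb"
print(segment_positions(T, "ab"))          # [1, 4, 7, 13]
print(segment_periodicity(T, "ab", 1, 3))  # (4, 5)
```

A result in which the actual count is below the expected count, as here, corresponds to imperfect segment periodicity.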