29
Visual Analysis of Sequences Using Fractal Geometry
Noa Ruschin Rimini and Oded Maimon
Department of Industrial Engineering, Tel-Aviv University, Tel-Aviv, Israel
Summary. Sequence analysis is a challenging task in data mining, relevant to many practical
domains. We propose a novel method for visual analysis and classification of sequences based
on an Iterated Function System (IFS), which is used to produce a fractal representation of
sequences. The proposed method offers an effective tool for visual detection of sequence
patterns that influence a target attribute, and requires no understanding of mathematical or
statistical algorithms. Moreover, it detects sequence patterns of any length, without
predefining the pattern length. It also makes it possible to visually distinguish between
different sequence patterns when categories reoccur within a sequence. Finally, the method
enables the visual detection of rare and missing sequences per target class.
29.1 Introduction
Mining sequential data is an important challenge relevant to many practical domains, such as
analysis of the impact of operation sequence on product quality (see Da Cunha et al., 2006;
Rokach et al., 2008; and Ruschin-Rimini et al., 2009), analysis of customers' purchase history
for determining the next best offer, analysis of product failure history for root cause
analysis, security (Moskovitch et al., 2008), and more.
This chapter presents a novel approach for detecting sequence patterns that influence a
target attribute and therefore act as sequence classifiers. It extends existing methods by
providing a visual application that enables domain experts, such as production engineers and
sales and customer service managers, to visually detect sequence patterns that affect a
target attribute.
Moreover, the proposed method overcomes limitations of existing methods such as the
n-gram approach used by Da Cunha et al. (2006): it detects sequence patterns of any length,
without predefining the pattern length, and it makes it possible to visually distinguish
between different sequence patterns, even when categories reoccur in a sequence. The proposed
method also enables the visual detection of rare and missing sequences per target attribute
value.
The proposed approach is based on an Iterated Function System (IFS), which is used to produce
a fractal representation of sequences.
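As a rough illustration of how an IFS can map a categorical sequence to points in the plane, the sketch below uses a chaos-game style construction. The exact maps used in this chapter are not given in the introduction, so the corner layout `CORNERS`, the contraction `ratio`, and the function name `ifs_trajectory` are assumptions made purely for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]

# Hypothetical layout: four categories placed at the corners of the unit square.
CORNERS: Dict[str, Point] = {
    "A": (0.0, 0.0),
    "B": (1.0, 0.0),
    "C": (1.0, 1.0),
    "D": (0.0, 1.0),
}


def ifs_trajectory(
    sequence: Sequence[str],
    corners: Dict[str, Point] = CORNERS,
    ratio: float = 0.5,
    start: Point = (0.5, 0.5),
) -> List[Point]:
    """Apply one contraction map per symbol and return the visited points.

    Each category c has the map f_c(p) = ratio * p + (1 - ratio) * corner_c.
    With ratio = 0.5 and four corners, every suffix of the sequence lands in
    its own sub-square, so the most recent categories determine the region.
    """
    x, y = start
    points: List[Point] = []
    for symbol in sequence:
        cx, cy = corners[symbol]  # raises KeyError for an unknown category
        x = ratio * x + (1.0 - ratio) * cx
        y = ratio * y + (1.0 - ratio) * cy
        points.append((x, y))
    return points


if __name__ == "__main__":
    # Two sequences that share the suffix "AB" end in the same sub-square.
    print(ifs_trajectory("AB")[-1])
    print(ifs_trajectory("CAB")[-1])
```

Under these assumptions, sequences that end with the same categories end in the same region of the square, regardless of their length. This is the property that lets patterns of any length show up as visual clusters.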
In particular, we developed a unique software application for visual detection of sequence
patterns. Our application comprises such features as color codes and zoom functions, in order
O. Maimon, L. Rokach (eds.), Data Mining and Knowledge Discovery Handbook, 2nd ed.,
DOI 10.1007/978-0-387-09823-4_29, © Springer Science+Business Media, LLC 2010