A Novel Method for Discovering Fuzzy Sequential
Patterns Using the Simple Fuzzy Partition Method
Ruey-Shun Chen, Yi-Chung Hu
Institute of Information Management, National Chiao Tung University, Hsinchu 300, Taiwan, ROC.
E-mail address: rschen@bis03.iim.nctu.edu.tw (R.-S. Chen)
Sequential patterns refer to the frequently occurring
patterns related to time or other sequences, and have
been widely applied to solving decision problems. For
example, they can help managers determine which
items were bought after some items had been bought.
Because fuzzy sequential patterns described in natural
language are one type of fuzzy knowledge
representation, they are also helpful in building a prototype
fuzzy knowledge base in a business. Moreover, each
fuzzy sequential pattern, which consists of several fuzzy
sets described in natural language, is well suited
to human thinking and helps to increase
users' flexibility in making decisions.
Additionally, since the comprehensibility of fuzzy rep-
resentation by human users is a criterion in designing
a fuzzy system, the simple fuzzy partition method is
preferable. In this method, each attribute is partitioned
by its various fuzzy sets with pre-specified member-
ship functions. The advantage of the simple fuzzy par-
tition method is that the linguistic interpretation of
each fuzzy set is easily obtained. The main aim of this
paper is to propose a fuzzy data mining technique
that discovers fuzzy sequential patterns by using
the simple fuzzy partition method. Two numerical examples
are used to demonstrate the usefulness of the
proposed method.
1. Introduction
Data mining is the exploration and analysis of data in
order to discover meaningful patterns (Berry & Linoff,
1997). Users can thus acquire knowledge easily by
examining the patterns discovered from databases, and
association rules are an important type
of knowledge representation. Agrawal et al. (Agrawal,
Imielinski, & Swami, 1993) initially proposed a method
to find association rules, and later proposed the well-known
Apriori algorithm (Agrawal, Mannila, Srikant, Toivonen,
& Verkamo, 1996). In addition to association rules,
sequential patterns are another important type of
knowledge representation, and effective algorithms (i.e.,
AprioriSome and AprioriAll) for mining sequential patterns
were proposed by Agrawal and Srikant (1995). Sequential
patterns have also been widely applied to
solving decision problems. For example, they can help
managers determine which items were bought after some
other items had been bought (Han & Kamber, 2001),
or understand the order in which the pages of a web site
are browsed (Myra, 2000).
Sequential pattern mining is the mining of frequently
occurring patterns related to time or other sequences (Han &
Kamber, 2001), where a sequence is an ordered list of
itemsets (Agrawal & Srikant, 1995). Specifically, if a
frequent sequence, that is, a sequence whose support is
larger than or equal to the user-specified minimum support,
consists of k itemsets (k ≥ 1), then we call it a frequent
k-sequence. Moreover, a sequential pattern is a frequent
sequence that is not contained in any other frequent
sequence (Agrawal & Srikant, 1995). For example,
a 2-sequence ⟨{Banana}, {Apple, Orange}⟩ may represent
items Apple and Orange being bought together after item
Banana had been bought, where {Banana} and {Apple,
Orange} are itemsets. However, ⟨{Banana}, {Apple, Orange}⟩
is not contained in the 1-sequence ⟨{Banana, Apple,
Orange}⟩, since the latter sequence is shorter than the
former.
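The notions of containment and support used above can be made
concrete with a small sketch. It follows the usual reading of
Agrawal and Srikant (1995): a sequence a is contained in a sequence
b if the itemsets of a can be matched, in order, to distinct
itemsets of b that are supersets of them, and the support of a
sequence is the fraction of customers whose sequence contains it.
The function names (is_contained, support) and the four customer
sequences are hypothetical and serve only as an illustration; they
are not taken from the paper.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[str]
Sequence = List[Itemset]


def is_contained(a: Sequence, b: Sequence) -> bool:
    """Return True if every itemset of `a` is a subset of a distinct
    itemset of `b`, with the order of itemsets preserved."""
    pos = 0
    for itemset in a:
        # Advance to the next itemset of b that includes this itemset.
        while pos < len(b) and not itemset <= b[pos]:
            pos += 1
        if pos == len(b):
            return False
        pos += 1  # each itemset of b may be matched at most once
    return True


def support(candidate: Sequence, customers: List[Sequence]) -> float:
    """Fraction of customer sequences that contain `candidate`."""
    hits = sum(1 for seq in customers if is_contained(candidate, seq))
    return hits / len(customers)


# The 2-sequence and the 1-sequence from the example in the text.
two_seq = [frozenset({"Banana"}), frozenset({"Apple", "Orange"})]
one_seq = [frozenset({"Banana", "Apple", "Orange"})]

# Hypothetical customer sequences (illustrative data only).
customers = [
    [frozenset({"Banana"}), frozenset({"Apple", "Orange"})],
    [frozenset({"Banana"}), frozenset({"Milk"}),
     frozenset({"Apple", "Orange", "Bread"})],
    [frozenset({"Apple", "Orange"}), frozenset({"Banana"})],
    [frozenset({"Banana", "Apple", "Orange"})],
]

print(is_contained(two_seq, one_seq))  # False: one_seq has only one itemset
print(support(two_seq, customers))     # 0.5: first two customers only
```

With a user-specified minimum support of 0.5, the 2-sequence would
count as a frequent 2-sequence in this toy data set, whereas the
third customer (reverse order) and the fourth customer (all items
in a single transaction) do not contribute to its support.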
Fuzzy sequential patterns described in natural language
are one type of fuzzy knowledge representation, so they
are helpful for building a prototype fuzzy knowledge
base in a business. Moreover, fuzzy sequential patterns
described in natural language are well suited to
human thinking and help to increase
users' flexibility in making decisions. Actually, each
Nomenclature: $K$, number of partitions in each quantitative attribute;
$k$, length of a fuzzy sequence; $d$, degree of a given relation, where $d \geq 1$;
$A^{x_m}_{K,i_m}$, $i_m$-th linguistic value of $K$ fuzzy partitions defined in quantitative
attribute $x_m$, $1 \leq i_m \leq K$; $\mu^{x_m}_{K,i_m}$, membership function of $A^{x_m}_{K,i_m}$;
$n$, total number of customers; $c_r$, $r$-th customer, where $1 \leq r \leq n$;
$[?]_r$, number of consecutive transactions ordered by transaction-time for $c_r$;
$[?]$, total number of frequent fuzzy grids;
$t^r_p$, $p$-th transaction corresponding to $c_r$, where
$t^r_p = (t^r_{p1}, t^r_{p2}, \ldots, t^r_{pd})$ and $1 \leq p \leq [?]_r$;
$L_j$, $j$-th frequent fuzzy grid, where $1 \leq j \leq [?]$.
© 2003 Wiley Periodicals, Inc.
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 54(7):660 – 670, 2003