A Novel Method for Discovering Fuzzy Sequential Patterns Using the Simple Fuzzy Partition Method

Ruey-Shun Chen, Yi-Chung Hu
Institute of Information Management, National Chiao Tung University, Hsinchu 300, Taiwan, ROC.
E-mail address: rschen@bis03.iim.nctu.edu.tw (R.-S. Chen)

Sequential patterns are frequently occurring patterns related to time or other sequences, and they have been widely applied to decision problems. For example, they can help managers determine which items were bought after some other items had been bought. Because fuzzy sequential patterns described in natural language are one type of fuzzy knowledge representation, they are helpful in building a prototype fuzzy knowledge base in a business. Moreover, each fuzzy sequential pattern, which consists of several fuzzy sets described in natural language, is well suited to the way human subjects think and gives users more flexibility in making decisions. Additionally, since the comprehensibility of a fuzzy representation to human users is a criterion in designing a fuzzy system, the simple fuzzy partition method is preferable. In this method, each attribute is partitioned into several fuzzy sets with pre-specified membership functions. The advantage of the simple fuzzy partition method is that the linguistic interpretation of each fuzzy set is easily obtained. The main aim of this paper is to propose a fuzzy data mining technique that discovers fuzzy sequential patterns by using the simple fuzzy partition method. Two numerical examples are used to demonstrate the usefulness of the proposed method.

1. Introduction

Data mining is the exploration and analysis of data in order to discover meaningful patterns (Berry & Linoff, 1997). Users can thus acquire knowledge easily by examining the patterns discovered from databases, and association rules are an important type of knowledge representation. Agrawal et al.
(Agrawal, Imielinski, & Swami, 1993) initially proposed a method to find association rules, later proposing the well-known Apriori algorithm (Agrawal, Mannila, Srikant, Toivonen, & Verkamo, 1996). In addition to association rules, sequential patterns are another important type of knowledge representation, and effective algorithms for mining them (such as AprioriSome and AprioriAll) were proposed by Agrawal and Srikant (1995). Sequential patterns have also been widely applied to decision problems. For example, they can help managers determine which items were bought after some other items had already been bought (Han & Kamber, 2001), or reveal the order in which the pages of a web site are browsed (Myra, 2000).

Sequential pattern mining is the mining of frequently occurring patterns related to time or other sequences (Han & Kamber, 2001), where a sequence is an ordered list of itemsets (Agrawal & Srikant, 1995). Specifically, if a frequent sequence, that is, one whose support is larger than or equal to the user-specified minimum support, consists of k itemsets (k ≥ 1), then we call it a frequent k-sequence. Moreover, a sequential pattern is a frequent sequence that is not contained in another sequence (Agrawal & Srikant, 1995). For example, the 2-sequence ⟨{Banana}, {Apple, Orange}⟩ may represent the items Apple and Orange being bought together after the item Banana had been bought, where {Banana} and {Apple, Orange} are itemsets. By contrast, ⟨{Banana}, {Apple, Orange}⟩ is not contained in the 1-sequence ⟨{Banana, Apple, Orange}⟩, since the latter sequence is shorter than the former.

However, since fuzzy sequential patterns described in natural language are one type of fuzzy knowledge representation, they are helpful in building a prototype fuzzy knowledge base in a business. Moreover, fuzzy sequential patterns described in natural language are well suited to the way human subjects think and give users more flexibility in making decisions.
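The notions of containment, support, and frequent k-sequence described above can be made concrete with a small sketch. It is not the algorithm proposed in this paper, and it is not AprioriAll or AprioriSome; it only checks one candidate sequence against a handful of customer sequences. Two points are assumptions rather than statements from the text: containment is taken to mean that the candidate's itemsets can be matched, in order, to supersets among the itemsets of the longer sequence, and support is taken to be the fraction of customers whose sequence contains the candidate. The function names and the toy data are hypothetical.

```python
from typing import FrozenSet, Sequence

Itemset = FrozenSet[str]
ItemSequence = Sequence[Itemset]


def is_contained(candidate: ItemSequence, sequence: ItemSequence) -> bool:
    """Return True if every itemset of `candidate` is a subset of a distinct
    itemset of `sequence`, with the matches occurring in the same order."""
    pos = 0
    for itemset in candidate:
        # Greedily advance to the earliest remaining itemset that covers it.
        while pos < len(sequence) and not itemset <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next candidate itemset must match a later position
    return True


def support(candidate: ItemSequence, customers: Sequence[ItemSequence]) -> float:
    """Fraction of customer sequences that contain `candidate` (assumed definition)."""
    hits = sum(is_contained(candidate, seq) for seq in customers)
    return hits / len(customers)


def is_frequent(candidate: ItemSequence, customers: Sequence[ItemSequence],
                min_support: float) -> bool:
    """A sequence is frequent if its support reaches the minimum support."""
    return support(candidate, customers) >= min_support


fs = frozenset  # shorthand for building itemsets

# Hypothetical customer sequences, each ordered by transaction time.
customers = [
    [fs({"Banana"}), fs({"Apple", "Orange"})],
    [fs({"Banana"}), fs({"Apple", "Orange", "Milk"})],
    [fs({"Banana", "Apple", "Orange"})],   # a 1-sequence: too short to contain the pattern
    [fs({"Apple"}), fs({"Banana"})],
]

# The 2-sequence <{Banana}, {Apple, Orange}> from the text.
pattern = [fs({"Banana"}), fs({"Apple", "Orange"})]

print(support(pattern, customers))           # 0.5
print(is_frequent(pattern, customers, 0.5))  # True
```

The third customer illustrates the remark in the text: the 1-sequence ⟨{Banana, Apple, Orange}⟩ cannot contain the 2-sequence because it has only one itemset to match against. The greedy scan is sufficient here because taking the earliest matching itemset never rules out a later match.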
Nomenclature

$K$, number of partitions in each quantitative attribute;
$k$, length of a fuzzy sequence;
$d$, degree of a given relation, where $d \ge 1$;
$A^{x_m}_{K,i_m}$, $i_m$-th linguistic value of $K$ fuzzy partitions defined in quantitative attribute $x_m$, $1 \le i_m \le K$;
$\mu^{x_m}_{K,i_m}$, membership function of $A^{x_m}_{K,i_m}$;
$n$, total number of customers;
$c_r$, $r$-th customer, where $1 \le r \le n$;
[symbol lost]$_r$, number of consecutive transactions ordered by transaction time for $c_r$;
[symbol lost], total number of frequent fuzzy grids;
$t^r_p$, $p$-th transaction corresponding to $c_r$, where $t^r_p = (t^r_{p1}, t^r_{p2}, \ldots, t^r_{pd})$ and $p \ge 1$ is at most the number of transactions for $c_r$;
$L_j$, $j$-th frequent fuzzy grid, where $j \ge 1$ is at most the total number of frequent fuzzy grids.

© 2003 Wiley Periodicals, Inc. JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 54(7):660–670, 2003

Actually, each