CBT-fi: Compact BitTable Approach for Mining Frequent Itemsets

A. Saleem Raja 1 and E. George Dharma Prakash Raj 2

1 Research Scholar, Department of Computer Science, Engineering and Technology, Bharathidasan University, Trichy, Tamil Nadu, India. Email: asaleemrajasec@gmail.com

2 Assistant Professor, Department of Computer Science, Engineering and Technology, Bharathidasan University, Trichy, Tamil Nadu, India.

Abstract

Frequent itemset mining is a data analysis method used to find relationships between the items in a given database. Considerable research progress has been made over the decades because of its wide range of applications. Recently, the BitTableFI and Index-BitTableFI approaches have been applied to mining frequent itemsets, with significant results. They use the BitTable as the base data structure and exploit it both horizontally and vertically. However, a simple and efficient approach for mining frequent itemsets from a given dataset is still needed. This paper introduces the Compact BitTable approach for mining frequent itemsets (CBT-fi), which clusters (groups) similar transactions into one and forms a compact BitTable structure. This reduces memory consumption as well as the number of times itemsets are checked against redundant transactions. Finally, we present results showing that the proposed algorithm performs better than the existing algorithms.

Keywords: Frequent Itemset Mining, Bit-Table, Association Rule Mining, BitTableFI

1. Introduction

The goal of data mining is to discover potentially useful information embedded in databases. Association rule mining is one of the data mining techniques; it was introduced in 1993 [2]. It finds interesting associations and/or correlations among large sets of data items [9]. Mining frequent itemsets is the primary task in mining association rules.
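
To make the clustering idea from the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that "similar" transactions are identical ones, encodes each unique transaction as an integer bitmask, and keeps one count per unique row. The function names `build_compact_bittable` and `support` are hypothetical, and the list `rcv` only borrows the paper's record-count-vector term; its layout here is an assumption.

```python
from collections import Counter

def build_compact_bittable(transactions, items):
    """Collapse identical transactions into unique bitmask rows.

    Returns (rows, rcv): rows[i] is an int whose bit j is set when items[j]
    occurs in the i-th unique transaction; rcv[i] is the number of original
    transactions that collapsed into that row (assumed meaning of "rcv").
    """
    position = {item: j for j, item in enumerate(items)}
    counts = Counter()
    for transaction in transactions:
        mask = 0
        for item in set(transaction):
            mask |= 1 << position[item]
        counts[mask] += 1
    rows = list(counts)                 # unique transactions, first-seen order
    rcv = [counts[mask] for mask in rows]
    return rows, rcv

def support(itemset, rows, rcv, items):
    """Count the transactions containing itemset by scanning unique rows only."""
    position = {item: j for j, item in enumerate(items)}
    mask = 0
    for item in itemset:
        mask |= 1 << position[item]
    # Each matching unique row contributes its multiplicity, so duplicate
    # transactions are never checked individually.
    return sum(count for row, count in zip(rows, rcv) if (row & mask) == mask)

# Illustrative data (not from the paper): five transactions, three unique.
items = ["a", "b", "c"]
transactions = [["a", "b"], ["a", "b"], ["a", "c"], ["b"], ["a", "b"]]
rows, rcv = build_compact_bittable(transactions, items)
support_ab = support(["a", "b"], rows, rcv, items)
```

In this toy input the five transactions collapse into three rows, so a support check touches three bitmasks instead of five. The saving grows with the share of duplicate transactions in the dataset.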

A typical and widely used example of frequent itemset mining is the analysis of supermarket transaction data, that is, the examination of customer behavior in terms of the purchased products. Frequent sets of products describe how often items are purchased together. In addition, frequent itemset mining has applications in areas such as bioinformatics, fraud detection and web usage mining [5].

Many algorithms have been proposed to find frequent itemsets. They can be grouped into the following categories [7,8]:

a) Candidate generation and test approach. Example: Apriori [2] and BitTableFI [3].
b) Pattern growth approach. Example: FP-growth [4].
c) Hybrid approach. Example: Eclat [10] and Index-BitTableFI [6].

Even though many algorithms have been proposed in recent years, frequent itemset (FI) mining remains a challenging task due to its complexity. Therefore, simple and computationally efficient algorithms are desirable. This paper introduces CBT-fi, which uses a simple and efficient data structure called the compact BitTable for storing clustered transactions. The compact BitTable contains only unique transactions, together with a record-count-vector (rcv) and a bit-count-vector (bcv), which are used to find the frequent itemsets in fewer iterations.

The rest of the paper is organized as follows. Section 2 presents related work. Section 3 presents the proposed algorithm with an example, and Section 4 presents the experimental results. Finally, we conclude the paper.

2. Related Work

Apriori [2] and FP-growth [4] are the base algorithms for many recent FI mining algorithms. Apriori uses an efficient candidate generation method in which each level uses the candidate itemsets generated at the previous level. However, it requires multiple database scans to generate the frequent itemsets. FP-growth is a representative pattern growth approach. It is a depth-first search (DFS) approach and uses a special data structure, the FP-tree, for a compact representation of the original database.
FP-growth needs only two database scans and no candidate generation, which makes it much faster than Apriori. However, FP-tree construction becomes complex for large datasets. Much research has been done over the decades to improve the efficiency of FI mining.

ACSIJ Advances in Computer Science: an International Journal, Vol. 3, Issue 5, No. 11, September 2014. ISSN: 2322-5157. www.ACSIJ.org. Copyright (c) 2014 Advances in Computer Science: an International Journal. All Rights Reserved.
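
As a point of comparison for the related work in Section 2, the level-wise candidate generation attributed to Apriori can be sketched as follows. This is a textbook-style illustration under simple assumptions (transactions as Python sets, an absolute support threshold `min_support`), not code from the paper. The function name `apriori` and the toy data are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset: support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items in one pass over the database.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s for s, n in counts.items() if n >= min_support}
    frequent = {s: counts[s] for s in level}

    k = 2
    while level:
        # Join step: build size-k candidates from frequent (k-1)-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
        # One further database scan per level to count the survivors.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c for c, n in counts.items() if n >= min_support}
        frequent.update({c: counts[c] for c in level})
        k += 1
    return frequent

# Illustrative data (not from the paper).
transactions = [["a", "b"], ["a", "b"], ["a", "c"], ["b"], ["a", "b"]]
result = apriori(transactions, min_support=2)
```

The `while` loop scans the full transaction list once per level, which corresponds to the repeated database scanning noted above as Apriori's main cost.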