Abstract—Discovery of frequent itemsets is an important problem in data mining. Most previous research is based on Apriori, which suffers from the generation of a huge number of candidate itemsets and performs repeated passes over the database to find frequent itemsets. To address this problem, we propose an algorithm for finding frequent K-itemsets in which the itemsets whose length is less than K are pruned from the database and are not considered for further processing, which reduces the size of the database and the number of comparisons to be performed. In addition, it generates the 1-itemsets as a data pre-processing step, which saves time and speeds up execution. The experimental results are included.

Index Terms—Frequent itemset, algorithm, database, apriori, support

I. INTRODUCTION

Association rule mining is an active area of today's data mining research. It usually consists of two phases, viz., discovery of frequent itemsets and generation of rules from the discovered frequent itemsets. Finding frequent itemsets has gained popularity because it has many applications, viz., market basket analysis, catalog design, add-on sales, store layout, and customer segmentation. The efficiency of any algorithm for finding frequent itemsets depends on three factors, viz., the generation of candidate itemsets, the data structures used, and the way it is implemented.

Frequent Itemset Mining (FIM) is driven by a minimum support value: it finds all itemsets whose support is no less than a user-specified minimum support threshold. Several algorithms have been proposed [7, 8, 9, 11, 12, 16, 19, 20, 22, 23, 24, 28] in this area. Apriori-like algorithms perform many database scans and generate a huge number of candidate itemsets. To find the frequent K-itemsets, such algorithms must start from the frequent 1-itemsets, then the 2-itemsets, and so on up to the frequent K-itemsets. The proposed method generates the frequent K-itemsets satisfying a minimum support directly from the database.
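As a rough illustration of the idea outlined above, the following Python sketch drops itemsets shorter than K and then counts K-itemsets directly from the remaining database. It is a naive enumeration under stated assumptions, not the algorithm proposed in this paper: the function name `frequent_k_itemsets`, the toy database, and the choice to measure support against the size of the original (unpruned) database are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_k_itemsets(database, k, min_support):
    """Return {k-itemset: support} for k-itemsets meeting min_support.

    Assumption: support is the fraction of ALL itemsets in the original
    database that contain the k-itemset.
    """
    total = len(database)
    if total == 0:
        return {}
    # Prune: an itemset with fewer than k items cannot contain any k-itemset.
    pruned = [frozenset(t) for t in database if len(set(t)) >= k]
    counts = Counter()
    for t in pruned:
        # Naive step (illustrative only): enumerate every k-subset of t.
        for subset in combinations(sorted(t), k):
            counts[frozenset(subset)] += 1
    return {s: c / total for s, c in counts.items() if c / total >= min_support}


# Hypothetical toy database of five itemsets.
db = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"milk"},
    {"bread", "butter"},
    {"bread", "butter", "milk", "jam"},
]
result = frequent_k_itemsets(db, k=3, min_support=0.4)
```

With K = 3, only two of the five itemsets survive pruning, so far fewer subsets need to be counted. In this toy database the only 3-itemset that reaches the 0.4 threshold is {bread, butter, milk}.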
At the data warehousing level, the 1-itemsets are generated while each transaction is processed. Hence, prior to the execution of the proposed algorithm, the 1-itemsets are already available, which eliminates the first pass of an Apriori-like algorithm and in turn saves time.

The outline of this paper is as follows. Section II presents definitions of frequent itemset and an outline of the Apriori algorithm for finding frequent itemsets. Section III reviews the related work in this area. Section IV explains the motivation for proposing the algorithm. Section V presents the proposed algorithm, the database schema, and an illustrative example. Section VI gives the experimental results, and the conclusions of the study are given in Section VII.

Manuscript received July 15, 2007. This work was supported in part by the TEQIP. H. Ravi Sankar is with the CTRI, Rajahmundry, India (phone: 91-9849418571; e-mail: hravisankar@india.com). M.M. Naidu was with Sri Venkateswara University, Tirupati, India. He is now the Vice-Principal and Professor, Department of Computer Science and Engineering (phone: 91-9848348473; e-mail: mmnaidu@yahoo.com).

II. BACKGROUND

The definitions of frequent itemset are as follows.

Definition 1.1 [30]: Let I be a finite set of attributes called items and D be a finite multiset of transactions. Each transaction T ∈ D is a set of items (T ⊆ I), usually called an itemset. The length or size of an itemset is the number of items it contains. An itemset of length K is referred to as a K-itemset.

Definition 1.2 [26]: Let L = {I1, I2, …, Im} be a set of literals, called items. Let a non-empty set of items T be called an itemset. Let D be a set of variable-length itemsets, where each itemset T ⊆ L. We say that an itemset T supports an item x ∈ L if x is in T. We say that an itemset T supports an itemset X ⊆ L if T supports every item in the set X. Each itemset has an associated measure of its statistical significance, called support.
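The notions in Definitions 1.1 and 1.2 map naturally onto set operations. The Python sketch below is only an illustration; the helper names `is_k_itemset`, `supports`, and `support`, as well as the toy data, are hypothetical. The `support` helper assumes the usual reading of the support measure, namely the fraction of itemsets in D that support X.

```python
def is_k_itemset(itemset, k):
    """An itemset of length k is a k-itemset (Definition 1.1)."""
    return len(itemset) == k


def supports(t, x):
    """T supports X if T contains every item in X (Definition 1.2)."""
    return set(x) <= set(t)


def support(x, d):
    """Fraction of itemsets in D that support X (assumes D is non-empty)."""
    return sum(1 for t in d if supports(t, x)) / len(d)


# Hypothetical database D; a list is used so repeated itemsets are kept,
# matching the multiset in Definition 1.1.
D = [{"a", "b", "c"}, {"a", "c"}, {"b"}, {"a", "c"}]
X = {"a", "c"}
```

Here `supports(D[0], X)` holds because {a, b, c} contains both a and c, while `supports(D[2], X)` does not; three of the four itemsets in D support X.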
The support of the itemset X in the set D is:

$$\mathrm{Support}(X, D) = \frac{|\{T \in D \mid T \text{ supports } X\}|}{|D|}$$

In other words, the itemset X holds in the set D with support s if s is the fraction of itemsets in D supporting X. A frequent itemset is an itemset whose support is no less than a user-defined threshold.

A New Approach for Mining Frequent K-itemset
H. Ravi Sankar and M.M. Naidu
Proceedings of the World Congress on Engineering and Computer Science 2007, WCECS 2007, October 24-26, 2007, San Francisco, USA. ISBN: 978-988-98671-6-4

Introduction to Apriori: Apriori is an influential algorithm for mining frequent itemsets for Boolean association rules. It uses prior knowledge of frequent itemset properties and iteratively finds all itemsets whose support is greater than or equal to a given minimum support value. The first pass of the algorithm counts item occurrences to determine the frequent 1-itemsets. In each subsequent pass, the frequent itemsets $L_{K-1}$ found in the (K-1)th pass are used to generate the candidate itemsets $C_K$, using the apriori-gen function described below. Then the database is scanned and the support of