International Journal of Recent Technology and Engineering (IJRTE), ISSN: 2277-3878, Volume-8 Issue-2, July 2019
Published By: Blue Eyes Intelligence Engineering & Sciences Publication
Retrieval Number: A1920058119/19©BEIESP
DOI: 10.35940/ijrte.A1920.078219

Abstract: Closed itemsets are frequent itemsets that uniquely determine the exact frequency of all frequent itemsets. They reduce the massive output of frequent itemset mining to a much smaller set without redundancy. In this paper, we present PSS-MCI, an efficient candidate-generation-based approach for mining all closed itemsets. It enumerates closed itemsets using a hash tree, candidate generation, and superset and subset checking. It uses a partition-based strategy to avoid unnecessary computation on itemsets that are not useful. The algorithm determines all closed itemsets in a single scan of the database. However, several unnecessary itemsets are hashed into the buckets. To overcome this limitation, heuristics are incorporated into PSS-MCI. Empirical evaluation shows that PSS-MCI outperforms candidate-generation-based and other approaches. Further, PSS-MCI finds all closed itemsets.

Index Terms: data mining, frequent itemsets, closed itemset, minimum support.

I. INTRODUCTION

Nowadays, huge amounts of data are collected from various sources and made widely available. Due to the complexity of the data and the needs of various applications, extracting interesting information from such collections is an active research area. Data mining is concerned with retrieving hidden, valuable and previously unknown information from a large collection of data or a database. Within it, Frequent Itemset Mining (FIM) is one of the popular data mining techniques; it aims at extracting highly correlated itemsets as hidden knowledge from a transactional database.
FIM is formally stated as follows: given a list of transactions and a minimum support threshold, find all itemsets whose number of occurrences is at least the minimum support count. The goal of FIM is to find the Frequent Itemsets (FIs), that is, the sets of items whose occurrence count over all transactions is at least the minimum support. One of the basic applications is market basket analysis [3], where each transaction corresponds to the set of products purchased by a customer. To analyze purchase behavior, one finds the sets of products that occur together in at least a minimum threshold percentage of transactions. FIM can be mapped to many real application scenarios and is related to other topics such as Frequent Episode Mining, Sequential Pattern Mining, Classification and Clustering. Several approaches have been proposed for FIM [4]. They are classified into two groups: candidate generation and test (CGAT), and approaches without candidate generation, such as FP-Growth [13]. The best-known algorithm in the first category is Apriori [2, 3], which relies on the Apriori heuristic and the anti-monotone property of support (every subset of a frequent itemset is also frequent). The second category is based on a tree structure rather than candidates: the entire database is represented as a tree, and the tree is mined recursively to extract all frequent itemsets.

Revised Manuscript Received on July 05, 2019.
U. Mohan Srinivas, Research Scholar, CSE, ANUCET, Guntur, India.
E. Srinivasa Reddy, Prof. and Dean, CSE, ANU, UCET, Guntur, India.

However, the number of FIs extracted from a large database can be huge, which requires a large storage area and more computation. For example, Table 1 records a list of 4 transactions; consider the minimum support min-sup = 50% (count = 2). FIM yields { a:3, b:3, c:2, ab:2, ac:2, ad:2, bc:2, bd:2, abc:2 }. As per the definition of a closed itemset, it is observed that {c:2, ab:2, ac:2, bc:2} can be determined from {abc:2}. Hence {c:2, ab:2, ac:2, bc:2} are considered redundant.
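As a rough illustration of this redundancy, the following Python sketch enumerates the frequent itemsets of a tiny database by brute force, keeps only the closed ones, and then recovers the support of every frequent itemset from the closed set alone. The four transactions are hypothetical and are not the contents of Table 1; the function names are chosen here for illustration, and the brute-force enumeration is not the PSS-MCI algorithm.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Brute-force support counting: enumerate every subset of every transaction."""
    counts = {}
    for transaction in transactions:
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    # Keep only itemsets whose count reaches the minimum support count.
    return {s: c for s, c in counts.items() if c >= min_count}


def closed_itemsets(frequent):
    """Keep the itemsets that have no proper superset with the same support."""
    return {
        s: c
        for s, c in frequent.items()
        if not any(s < other and c == oc for other, oc in frequent.items())
    }


def support_from_closed(itemset, closed):
    """Support of a frequent itemset: the largest support among its closed supersets.

    Assumes `itemset` is frequent, so at least one closed superset exists.
    """
    return max(c for s, c in closed.items() if itemset <= s)


def label(itemset):
    """Render an itemset such as {'a', 'b'} as the string 'ab'."""
    return "".join(sorted(itemset))


# Hypothetical database of 4 transactions (an assumption, not Table 1).
transactions = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "d"}, {"b", "d"}]

frequent = frequent_itemsets(transactions, min_count=2)  # min-sup = 50%
closed = closed_itemsets(frequent)

print("frequent:", {label(s): c for s, c in frequent.items()})
print("closed:  ", {label(s): c for s, c in closed.items()})
```

On this hypothetical database the frequent itemsets are a:3, b:3, c:2, d:2, ab:2, ac:2, bc:2 and abc:2, of which only a:3, b:3, d:2 and abc:2 are closed. As in the example above, c, ab, ac and bc are redundant because their support equals that of their closed superset abc.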
Table 1: Sample Transactional Database

As a result, several condensed representations of FIs have been proposed to reduce the size of the FI set without losing knowledge [8]. The first alternative was Maximal Itemsets: the itemsets whose support reaches the threshold and that have no frequent superset. This representation greatly reduces the size of the output. Algorithms in this family include MaxClique, Mafia [6], Pincer search [16], Maxminer [Bayardo 98], Depth project [3], GenMax [12] and FPMax [12]. All of these algorithms are able to extract all the maximal itemsets. However, multiple scans of the database were needed when main memory was small, and too many candidate itemsets were generated at each pass. Moreover, the exact support of every frequent itemset cannot be recovered from the maximal itemsets.

This led to the investigation of Closed Itemsets (CIs). A CI is an itemset that has no superset with the same support. Research in this direction includes top-down approaches [7, 5, 20], bottom-up approaches, and a combination of both, Pincer search [16, 17]. These approaches have been shown to produce output that covers all the frequent itemsets. However, multiple scans of the database were needed when main memory was small, and too many candidate itemsets were generated at each pass.

Mining Closed Item sets using Partition Based Single Scan Algorithm
U. Mohan Srinivas, E. Srinivasa Reddy

Contributions: In this paper, we propose a novel approach called Partition Based Single Scan Approach (PSS-MCI) for Mining Closed Itemsets. A hash table is used to capture the possible frequent