DOI: 10.4018/IJIRR.2017100103
International Journal of Information Retrieval Research
Volume 7 • Issue 4 • October-December 2017
Copyright © 2017, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.

Frequent Itemset Mining in Large Datasets: A Survey

Amrit Pal, Indian Institute of Information Technology, Allahabad, India
Manish Kumar, Indian Institute of Information Technology, Allahabad, India

ABSTRACT

Frequent itemset mining is a well-known area of data mining. Most of the available techniques for frequent itemset mining require complete information about the data in order to generate association rules. The amount of data is increasing day by day and taking the form of Big Data, which requires changes to the algorithms so that they can work on such large-scale data. Parallel implementations of the mining techniques can provide a solution to this problem. This paper surveys frequent itemset mining techniques that can be used in a parallel environment. Programming models such as Map Reduce provide an efficient architecture for working with Big Data; the paper also discusses the issues and feasibility of implementing these techniques in such an environment.

KEYWORDS

BigData, Count, Frequent Itemset, HDFS, Map Reduce, Mapper, Reducer, Set

INTRODUCTION

The amount of data is increasing day by day, and this growth creates some basic challenges for frequent itemset mining algorithms. As the size of the data increases, the time required to process it also increases. Millions of customers visit Walmart daily, generating millions of transactions. Every hour, Walmart generates approximately 2.5 petabytes of data (DeZyre, 2016). Social networking websites generate huge amounts of unstructured data daily. Managing this huge amount of unstructured data with conventional techniques is a challenging task.
When the amount of data becomes so large that it is difficult to manage with conventional data management systems, it is called Big Data (Manyika, 2011). Transaction datasets are also increasing in size and taking the shape of Big Data. Algorithms such as Apriori (Agrawal, 1994) and FP-Growth are available for mining frequent itemsets from transactional datasets. Frequent itemsets can be mined from transactional datasets using either a sequential or a parallel approach. Most of the available frequent itemset mining algorithms take the sequential approach. Processing the data for frequent itemsets has some basic requirements: counting the total number of transactions, counting the different items in each itemset, maintaining a list of items, and completely scanning the dataset. The basic operation in frequent itemset mining is calculating the support of each itemset. Algorithms are required to scan the