Computer Engineering and Intelligent Systems, www.iiste.org
ISSN 2222-1719 (Paper), ISSN 2222-2863 (Online)
Vol.5, No.12, 2014

Algorithmic Framework for Frequent Pattern Mining with FP-Tree

Georgina N. Obunadike 1&2*, Audu Isah 2, Arthur Umeh 3, H. C. Inyiamah 4

1. Department of Mathematical Sciences and IT, Federal University, Dutsin-ma, Katsina State
2. Department of Mathematics and Statistics, Federal University of Technology, Minna, Niger State
3. Department of Information and Media Technology, Federal University of Technology, Minna, Niger State
4. Department of Computer Electronics, Nnamdi Azikiwe University, Awka, Anambra State
*Email: nkoliobunadike@yahoo.com

Abstract
The FP-tree algorithm is currently one of the fastest approaches to frequent item set mining. Studies have also shown that the pattern-growth method is one of the most efficient methods for frequent pattern mining. The algorithm is based on a prefix-tree representation of the given transaction database (the FP-tree), which can save substantial amounts of memory when storing the database. The basic idea of the FP-growth algorithm can be described as a recursive elimination scheme that begins with a preprocessing step in which all items that are not frequent are deleted from the transactions. In this study, a simple framework for mining frequent patterns is presented using the FP-tree, an extended prefix-tree structure that supports mining frequent patterns without candidate generation and at lower cost. The framework is intended to make the concept easier to understand for inexperienced data analysts and for organizations interested in association rule mining.

Keywords: Association Rule, Frequent Pattern Mining, Apriori Algorithm, FP-tree

1. Introduction
Data mining is the process of discovering interesting knowledge from large amounts of data stored in databases, data warehouses, or other information repositories.
Data mining is an emerging field that has attracted considerable attention from researchers, industry, database practitioners and data analysts. It has applications in many fields, such as marketing, financial forecasting and decision support (Jiawei, Micheline, and Jian, 2011). Data mining algorithms and visualization tools are used to find important patterns in data and to create useful forecasts. The technology is applied in virtually all business sectors, including banking, telecommunication, manufacturing, marketing, and e-commerce (Zhao Hui and Jamie, 2005).

Data mining tasks such as association, clustering, classification/prediction and outlier detection use various methods and techniques for knowledge discovery from databases. Association rule mining is one of the important tasks of data mining; it identifies interesting associations and correlations among items in large data sets (Intan and Rolly, 2005). Mining frequent patterns is an important part of association rule mining. The FP-tree algorithm is currently one of the fastest approaches to frequent item set mining. The FP-tree was first proposed by Han, Pei, and Yin (2000). It is a compact representation of a transaction database that contains the frequency information of all relevant patterns in a dataset.

Association rule mining has many important practical applications. An association rule is of the form X => Y, and each rule has two measures: support and confidence. The association rule mining problem is to find the rules that satisfy a user-specified minimum support and minimum confidence. It consists mainly of two steps: finding all frequent patterns, and generating association rules from those frequent patterns. The identification of sets of items, products, symptoms, characteristics, and so forth that often occur together in a given database can be seen as one of the most basic tasks in data mining (Intan and Rolly, 2005).
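As a rough illustration of the two measures, the following Python sketch computes the support and confidence of a rule X => Y over a small transaction list and checks them against user-specified thresholds. The sample transactions and the function names (`support`, `confidence`, `rule_holds`) are hypothetical and chosen only for this example. The sketch counts transactions directly; it is not the FP-tree method itself. Support is taken here as the fraction of transactions containing every item of X and Y, and confidence as the fraction of transactions containing X that also contain Y.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy transactions, invented purely for illustration.
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"milk", "eggs"}),
    frozenset({"bread", "milk", "eggs"}),
]


def count_containing(itemset: Iterable[str],
                     transactions: List[FrozenSet[str]]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def support(itemset: Iterable[str],
            transactions: List[FrozenSet[str]]) -> float:
    """Fraction of all transactions that contain `itemset`."""
    if not transactions:
        return 0.0
    return count_containing(itemset, transactions) / len(transactions)


def confidence(x: Iterable[str], y: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions containing X that also contain Y."""
    x_items, y_items = frozenset(x), frozenset(y)
    x_count = count_containing(x_items, transactions)
    if x_count == 0:
        return 0.0  # X never occurs, so the rule has no evidence
    return count_containing(x_items | y_items, transactions) / x_count


def rule_holds(x: Iterable[str], y: Iterable[str],
               transactions: List[FrozenSet[str]],
               min_support: float, min_confidence: float) -> bool:
    """True if X => Y meets both user-specified thresholds."""
    x_items, y_items = frozenset(x), frozenset(y)
    return (support(x_items | y_items, transactions) >= min_support
            and confidence(x_items, y_items, transactions) >= min_confidence)


if __name__ == "__main__":
    print(support({"bread", "milk"}, TRANSACTIONS))
    print(confidence({"bread"}, {"milk"}, TRANSACTIONS))
    print(rule_holds({"bread"}, {"milk"}, TRANSACTIONS, 0.5, 0.7))
```

Scanning every transaction for each candidate rule, as this sketch does, becomes expensive on large databases; that cost is what compact structures such as the FP-tree are meant to reduce.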
Let $I = \{I_1, I_2, \ldots, I_m\}$ be a set of items. Let $D$ be a set of database transactions where each transaction $T$ is a nonempty item set such that $T \subseteq I$. Each transaction has an identifier, called a $TID$. Let $A$ be a set of items. A transaction $T$ is said to contain $A$ if $A \subseteq T$. An association rule is of the form $A \Rightarrow B$, where $A \subset I$, $B \subset I$, $A \neq \emptyset$, $B \neq \emptyset$, and $A \cap B = \emptyset$. The rule $A \Rightarrow B$ holds in the transaction set $D$ with support $s$, where $s$ is the percentage of transactions in $D$ that contain $A \cup B$ (i.e., the union of sets $A$ and $B$). This is taken to be the probability $P(A \cup B)$. The rule $A \Rightarrow B$ has confidence $c$ in the transaction set $D$, where $c$ is the percentage of transactions in $D$ containing $A$ that also contain $B$. This is taken to be the conditional probability $P(B \mid A)$. That is: