International Journal of Electrical and Computer Engineering (IJECE)
Vol. 8, No. 6, December 2018, pp. 4477~4485
ISSN: 2088-8708, DOI: 10.11591/ijece.v8i6.pp4477-4485
Journal homepage: http://iaescore.com/journals/index.php/IJECE

Postdiffset Algorithm in Rare Pattern: An Implementation via Benchmark Case Study

Mustafa Man¹, Wan Aezwani Wan Abu Bakar², Masita Masila Abd. Jalil³, Julaily Aida Jusoh⁴
¹,³ School of Informatics & Applied Mathematics, Universiti Malaysia Terengganu, Malaysia
²,⁴ Faculty of Informatic and Computing, Universiti Sultan Zainal Abidin, Malaysia

Article Info
Article history: Received Apr 10, 2018; Revised Jun 12, 2018; Accepted Jun 30, 2018

ABSTRACT
Frequent and infrequent itemset mining are trending data mining techniques. The association rule (AR) patterns they generate help decision makers and business policy makers project the next intended items across a wide variety of applications. While frequent itemsets deal with the items that are most often purchased or used, infrequent itemsets deal with items that occur infrequently, also called rare items. AR mining remains one of the most prominent areas in data mining; it aims to extract interesting correlations, patterns, associations or causal structures among sets of items in transaction databases or other data repositories. The design of the database structure in association rule mining algorithms is based on either the horizontal or the vertical data format. Both formats have been widely discussed, with a few example algorithms shown for each. Approaches based on the horizontal format suffer from huge candidate generation and multiple database scans, which result in higher memory consumption. To overcome this issue, solutions based on the vertical format have been proposed. One of the established algorithms for the vertical data format is Eclat (Equivalence Class Transformation).
Because of its 'fast intersection', in this paper we analyze the fundamental Eclat and Eclat variants such as diffset and sortdiffset. Building on the vertical data format and continuing the line of Eclat extensions, we propose the Postdiffset algorithm, a new member of the Eclat variants that uses the tidset format in the first loop and the diffset format in the later loops. We present the performance of the Postdiffset algorithm prior to its implementation in mining infrequent (rare) itemsets. The Postdiffset algorithm outperforms diffset and sortdiffset by 23% and 84%, respectively, on the mushroom dataset, and by 94% and 99%, respectively, on the retail dataset.

Keyword: Association rule mining; Eclat algorithm; Frequent itemset; Infrequent itemset; Vertical database

Copyright © 2018 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author:
Wan Aezwani Wan Abu Bakar,
Faculty of Informatic and Computing, Universiti Sultan Zainal Abidin,
Besut Campus, 22200 Besut, Terengganu, Malaysia.
Email: wanaezwani@unisza.edu.my

1. INTRODUCTION
The main objective of association rule mining is to find the correlations, associations or causal structures among sets of items in a data repository. In other words, it allows the discovery of implicative and interesting tendencies in databases. Frequent itemset mining and infrequent itemset mining are critical fields in association rule mining. Both are widely used across a variety of domains such as market basket analysis, remedial, biology, banking or retail services [1], [21]. Frequent or infrequent itemsets may contribute to big data generation. Undoubtedly, the critical issues regarding memory space consumption and data storage capacity will have a significant effect prior to frequent or infrequent generation of