J. ICT Res. Appl. Vol. 10, No. 2, 2016, 153-176

Received October 30th, 2015, Revised May 9th, 2016, Accepted for publication May 31st, 2016.
Copyright © 2016 Published by ITB Journal Publisher, ISSN: 2337-5787, DOI: 10.5614/itbj.ict.res.appl.2016.10.2.5

Mining High Utility Itemsets with Regular Occurrence

Komate Amphawan 1,*, Philippe Lenca 2, Anuchit Jitpattanakul 3 & Athasit Surarerks 4

1 Burapha University, Computational Innovation Laboratory, 20131 Chonburi, Thailand
2 Institut Telecom, Telecom Bretagne, UMR CNRS 3192 Lab-STICC, France
3 Faculty of Applied Science, KNUTNB, 10800 Bangkok, Thailand
4 Chulalongkorn University, ELITE Laboratory, 10330 Bangkok, Thailand
*E-mail: komate@gmail.com

Abstract. High utility itemset mining (HUIM) plays an important role in the data mining community and in a wide range of applications. For example, in retail business it is used for finding sets of sold products that give high profit, low cost, etc. These itemsets can help improve marketing strategies, design promotions/advertisements, etc. However, since HUIM only considers the utility values of items/itemsets, it may not be sufficient for observing the product-buying behavior of customers, such as information related to “regular purchases of sets of products having a high profit margin”. To address this issue, the occurrence behavior of itemsets (in terms of regularity) was investigated simultaneously with their utility values. Then, the problem of mining high utility itemsets with regular occurrence (MHUIR), i.e. finding sets of co-occurring items with high utility values and regular occurrence in a database, was considered. An efficient single-pass algorithm, called MHUIRA, was introduced. A new modified utility-list structure, called NUL, was designed to efficiently maintain utility values and occurrence information and to increase the efficiency of computing the utility of itemsets.
Experimental studies on real and synthetic datasets and complexity analyses are provided to show the efficiency of MHUIRA combined with NUL in terms of time and space usage for mining interesting itemsets based on regularity and utility constraints.

Keywords: association rule mining; data mining; high utility itemsets; occurrence behavior; regularity constraint; utility-list structure.

1 Introduction

Association rule mining (ARM) [1,2] is a fundamental task of data mining and data analysis. It aims to discover relationships between objects or events, expressed in the form of an X → Y rule. For example, from the purchasing data of a retail business, ARM may discover the rule “Beer → Diaper [s: 30%, c: 60%]”, which expresses the buying behavior of customers: 30% of customers bought Beer together with Diaper (the support, s), and 60% of the customers who bought Beer also bought Diaper at the same time (the confidence, c). ARM can be applied in several areas such as retail marketing, web clickstream analysis and DNA analysis. ARM consists