Association rule hiding in risk management for retail supply chain collaboration

Hai Quoc Le a,*, Somjit Arch-int a, Huy Xuan Nguyen b, Ngamnij Arch-int a

a Department of Computer Science, Faculty of Science, Khon Kaen University, Khon Kaen 40002, Thailand
b Institute of Information Technology, Institute of Science and Technology of Vietnam, 18 Hoang Quoc Viet Road, Cau Giay District, Hanoi, Viet Nam

1. Introduction

In today's competitive environment, collaboration between retailers and suppliers is a requirement for both parties' development [1]. Successful collaboration can bring products to market faster, reduce production and logistics costs, drive market share, and increase sales [2]. In retail supply chain collaboration, the discovery of interesting association relationships among large amounts of business transaction records can help in many business decision-making processes, such as catalog design, cross-marketing, cross-selling, and inventory control [3,4]. Data sharing therefore becomes important to the development of every member and partnership involved in the collaboration [5]. However, data sharing may leak sensitive knowledge that competing companies are motivated and able to collect, analyze, and exploit to gain a competitive edge [6,7]. This knowledge leakage in turn creates serious risks for enterprises that share their data. Security policy therefore plays an important role in the common business process and in the risk management of enterprises [8–10].

Risk management in supply chain collaboration has received much attention in the past decade. Gouriveau and Noyes [11] formalized an object model based on generic items used in risk management. Wu et al. [12] proposed an inbound supply risk analysis methodology to classify, manage, and assess risks. Wu and Olson [13] developed three types of risk evaluation models within supply chains: chance constrained programming (CCP), data envelopment analysis (DEA), and multi-objective programming (MOP).
Tuncel and Alpan [14] integrated risk management procedures into the design, planning, and performance evaluation process of supply chain networks through Petri net (PN) based simulation. Giannakis and Louis [15] proposed a framework for the management of disruptions and the mitigation of risks in manufacturing supply chains based on a multi-agent decision support system. Wulan and Petrovic [16] proposed a fuzzy logic based system for risk analysis and evaluation.

Association rule hiding is an emerging area of data mining that aims to transform an original database into a released database such that the sensitive association rules, which are used to make decisions, cannot be discovered, whereas the non-sensitive association rules can still be mined. Previous studies in association rule hiding mainly focused on proposing optimal algorithms for hiding sensitive association rules with the least significant side effects. Side effects are defined as the impact of the hiding process on the results of association rule mining and include lost rules, ghost rules, false rules, and accuracy.

The border-based approach focuses on the weight of the positive border [17] or the maxmin set [18] to reduce the support of itemsets in the revised negative border until they fall below the minimum support threshold. At the same time, the border-based approach tries to protect the expected positive border in order to maintain the non-sensitive itemsets. The exact approach [19,20] formulates a Constraint Satisfaction Problem (CSP) to find a globally optimal solution for hiding a set of sensitive frequent itemsets. Both the border-based and exact approaches have achieved good results when hiding a set of frequent itemsets.
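
To make the notions of support reduction and side effects concrete, the following is a minimal sketch on toy data, not the algorithm of any of the cited works. It hides a sensitive itemset by deleting a chosen item from supporting transactions until the itemset's support drops below the minimum support threshold, then reports which non-sensitive frequent itemsets were lost. The function names (`support`, `frequent_itemsets`, `hide_itemset`), the toy transactions, and the threshold of 0.5 are all assumptions made for illustration, and the sketch works at the level of frequent itemsets rather than full association rules.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_sup):
    """Brute-force enumeration of itemsets with support >= min_sup.

    Exponential in the number of items; only suitable for toy data.
    """
    items = sorted(set().union(*transactions))
    return {
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
        if support(c, transactions) >= min_sup
    }


def hide_itemset(sensitive, removed_item, transactions, min_sup):
    """Delete `removed_item` from transactions supporting `sensitive`
    until support(sensitive) falls below min_sup (hypothetical helper)."""
    sensitive = frozenset(sensitive)
    released = [set(t) for t in transactions]
    for t in released:
        if support(sensitive, released) < min_sup:
            break
        if sensitive <= t:
            t.discard(removed_item)
    return [frozenset(t) for t in released]


# Toy transaction database (illustrative only).
transactions = [
    frozenset({"bread", "milk", "beer"}),
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk", "beer"}),
    frozenset({"milk", "beer"}),
    frozenset({"bread", "beer"}),
    frozenset({"bread", "milk"}),
]
min_sup = 0.5
sensitive = frozenset({"bread", "beer"})

before = frequent_itemsets(transactions, min_sup)

# Side effects of each choice of item to delete: non-sensitive frequent
# itemsets that are no longer frequent in the released database.
lost = {}
for item in sorted(sensitive):
    released = hide_itemset(sensitive, item, transactions, min_sup)
    after = frequent_itemsets(released, min_sup)
    assert sensitive not in after  # the sensitive itemset is hidden
    lost[item] = before - after - {sensitive}

for item, lost_sets in lost.items():
    print(item, [sorted(s) for s in lost_sets])
```

In this toy database the two choices of deleted item hide the same sensitive itemset but differ in how many non-sensitive itemsets they destroy, which is the kind of side effect that hiding algorithms try to minimize. Because the sketch only deletes items, it cannot create ghost itemsets; approaches that insert items would need to check for those as well.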
Computers in Industry xxx (2013) xxx–xxx
journal homepage: www.elsevier.com/locate/compind

ARTICLE INFO

Article history:
Received 15 April 2013
Accepted 23 April 2013
Available online xxx

Keywords:
Association rule hiding
Risk management
Data sharing
Association rule mining
Retail supply chain collaboration

ABSTRACT

Association rule hiding is an efficient solution that helps enterprises avoid the risk caused by sensitive knowledge leakage when sharing data in their collaborations. This study examines how data sharing has the potential to create risk for enterprises in retail supply chain collaboration and proposes a new algorithm to remove sensitive knowledge from the released database based on the intersection lattice of frequent itemsets. The proposed algorithm specifies the victim item such that the modification of this item causes the least impact on frequent itemsets and the non-sensitive association rules. In the experiment described in this paper, this algorithm is used in risk avoidance for a retailer sharing data in retail supply chain collaboration. The results indicate that our approach is applicable in a real context and outperforms previous mechanisms.

© 2013 Elsevier B.V. All rights reserved.

* Corresponding author. Tel.: +66 896191827.
E-mail addresses: hai_lq@qtttc.edu.vn, hailq79@yahoo.com (H.Q. Le), somjit@kku.ac.th (S. Arch-int), nxhuy564@gmail.com (H.X. Nguyen), ngamnij@kku.ac.th (N. Arch-int).

0166-3615/$ see front matter
http://dx.doi.org/10.1016/j.compind.2013.04.011