An incremental privacy-preservation algorithm for the (k, e)-Anonymous model

Bowonsak Srisungsittisunti *, Juggapong Natwichai

Computer Engineering Department, Faculty of Engineering, Chiang Mai University, Chiang Mai, Thailand

Article info
Article history: Received 16 December 2013; Received in revised form 14 October 2014; Accepted 15 October 2014; Available online xxxx
Keywords: Privacy preservation; Anonymity; Incremental algorithm; Privacy breach

Abstract
Data privacy is an important issue to address when data are to be published. In this paper, we address the problem of data privacy under a prominent privacy model, (k, e)-Anonymous. Our scenario is that, when a new dataset is to be released, other datasets may, at the same time, have been released elsewhere. A problem arises because some attackers might obtain multiple versions of the same dataset and compare them with the newly released dataset. Although the privacy of each dataset has been well preserved individually, such a comparison can lead to a privacy breach, a so-called "incremental privacy breach". To address this problem effectively, we first study the effects of multiple dataset releases theoretically. We find that an incremental privacy breach occurs when any part of the new dataset overlaps with any part of an existing dataset. Based on this study, we propose a polynomial-time algorithm. The algorithm needs to consider only one previous version of the dataset, and it can also skip computing the overlapping partitions. Thus, the computational complexity is reduced from $O(n^m)$ to only $O(pn^3)$, where $p$ is the number of partitions, $n$ is the number of tuples, and $m$ is the number of released datasets. At the same time, the privacy of all of the released datasets, as well as the optimal solution, can always be guaranteed.
In addition, experimental results that illustrate the efficiency of our algorithm on real-world datasets are presented.

© 2014 Elsevier Ltd. All rights reserved.

Computers and Electrical Engineering xxx (2014) xxx–xxx
Journal homepage: www.elsevier.com/locate/compeleceng
http://dx.doi.org/10.1016/j.compeleceng.2014.10.007
0045-7906/© 2014 Elsevier Ltd. All rights reserved.

Reviews processed and recommended for publication to the Editor-in-Chief by Associate Editor Dr. Felix Gomez Marmol.
* Corresponding author. E-mail addresses: bowonsak.s@gmail.com (B. Srisungsittisunti), juggapong@eng.cmu.ac.th (J. Natwichai).

1. Introduction

Privacy preservation should be the first priority when data are shared between business collaborators. Over the past decade, various privacy preservation techniques have been proposed [1–5]. When data are to be shared, such techniques can be applied prior to sharing, and the privacy-preserved data can then be used for the intended purposes. However, the data often change over time. Applying the privacy preservation techniques each time the data change can result in different versions of the privacy-preserved data. Comparing these versions can lead to a privacy breach, which is called an incremental privacy breach [6]. In this paper, we present an algorithm for preserving the privacy of data that are not static, i.e., when records are appended continuously. The privacy preservation model we focus on is based on (k, e)-Anonymous [5], which is one of the most prominent models.
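As a rough illustration of the incremental privacy breach described in this section, the following Python sketch compares two releases of a single partition. It assumes the usual reading of the (k, e)-Anonymous condition from [5], namely that each partition must contain at least k distinct sensitive values whose range is at least e. The function names and the salary figures are hypothetical, and the sketch is not the algorithm proposed in this paper.

```python
from collections import Counter


def is_ke_anonymous(sensitive_values, k, e):
    """Assumed (k, e) condition: at least k distinct sensitive values
    in the partition, spanning a range of at least e."""
    distinct = set(sensitive_values)
    if not distinct:
        return False
    return len(distinct) >= k and (max(distinct) - min(distinct)) >= e


def leaked_values(old_partition, new_partition):
    """Multiset difference: sensitive values that appear in the new
    release of a partition but not in the old one."""
    diff = Counter(new_partition) - Counter(old_partition)
    return sorted(diff.elements())


# Hypothetical salary partitions from two releases of the same dataset.
release_1 = [30_000, 40_000, 50_000]
release_2 = [30_000, 40_000, 50_000, 90_000]  # one record appended

k, e = 3, 20_000

# Each release satisfies the assumed condition on its own.
both_safe = is_ke_anonymous(release_1, k, e) and is_ke_anonymous(release_2, k, e)

# An attacker holding both releases can isolate the appended record's value.
exposed = leaked_values(release_1, release_2)
```

In this toy case each release passes the check individually, yet differencing the two releases isolates the sensitive value of the newly appended record, which is the kind of cross-version comparison the incremental setting has to guard against.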