HiMod-Pert: Histogram Modification Based Perturbation Approach for Privacy Preserving Data Mining

Alpa Kavin Shah 1 and Ravi Gulati 2

1 MCA Department, Sarvajanik College of Engineering and Technology, Surat, India
alpa.shah@scet.ac.in
2 Department of Computer Science, Veer Narmad South Gujarat University, Surat, India
rmgulati@vnsgu.ac.in

Abstract. Privacy Preserving Data Mining (PPDM) prevents the disclosure of sensitive quasi-identifiers of a dataset during mining by perturbing the data. The perturbed dataset is then used by a trusted third party for effective derivation of association rules. Many PPDM algorithms destroy the original data to generate the mining results. It is essential that the perturbed data preserve the statistical inference of the sensitive attributes and minimize information loss. Existing techniques based on Additive, Multiplicative and Geometric Transformations have minimal information loss, but suffer from reconstruction vulnerabilities. We propose a Histogram Modification based method, viz. HiMod-Pert, for preserving the sensitive numeric attributes of the perturbed dataset. Our method uses the difference in neighboring values to determine the perturbation factor. Experiments are performed to implement and test the applicability of the proposed technique. Evaluation using descriptive statistical metrics shows that the information loss is minimal.

Keywords: Privacy preserving data mining · Histogram Modification · Additive white Gaussian noise · Multiplicative perturbation · Geometric Data Perturbation

1 Introduction

Over the last couple of decades, information collection over the Internet has witnessed exponential growth. More users have started providing their personal information through Internet-based activities such as purchases and sales, auctions, entertainment, gaming, and online surveys, to name a few.
A person can now be easily and accurately identified from his or her Internet activities, which poses a serious threat of privacy intrusion to individuals. This vast pool of data has created the need for efficient data mining protocols. Data mining, which was once confined to the narrower domain of Enterprises and Applications, now encompasses Big Data and Cloud Computing. Data collection has increased manyfold for research, trend analysis and, more often, collaborative mining results. It is vital that the information provided by users does not lead to a breach of their privacy. This concern has caught the attention of researchers and is still widely studied today. PPDM algorithms tackle this issue by optimizing

© ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2018
Z. Patel and S. Gupta (Eds.): ICFITT 2017, LNICST 220, pp. 28–36, 2018.
https://doi.org/10.1007/978-3-319-73712-6_3