IJSRSET19629 | Received: 03 March 2019 | Accepted: 13 March 2019 | March-April 2019 [6 (2): 84-90]
© 2019 IJSRSET | Volume 6 | Issue 2 | Print ISSN: 2395-1990 | Online ISSN: 2394-4099
Themed Section: Engineering and Technology
DOI: https://doi.org/10.32628/IJSRSET19629

A Study on Models and Techniques of Anonymization in Data Publishing

Shipra Sharma, Naveen Choudhary, Kalpana Jain
Department of CSE, College of Technology and Engineering, Udaipur, Rajasthan, India

ABSTRACT

In an era where much of the world runs online, the storing and publishing of data online has increased to a great extent. A large amount of information is collected and published to networks that are publicly available. With this exposure comes the risk of leaking information about an individual when the data is published. We therefore need a mechanism for preserving the privacy of individuals, which is where the concept of privacy-preserving data publishing came into existence. To achieve this privacy, different privacy models and techniques have been proposed, which give different levels of resistance against different attacks by adversaries. In this paper we discuss these models and techniques and present a comparative study of them.

Keywords: Privacy Models, Anonymization Techniques, Data Publishing, Privacy Preservation.

I. INTRODUCTION

Publishing data involves making it available for public use in further research, studies or surveys. When data is published, however, the identity of individuals must be protected to maintain their privacy. Protecting privacy causes a loss of information in the data and decreases its utility, so the major challenge in this field is to preserve privacy with minimum data loss. The process of modifying the data before publication so that it does not leak the identity of an individual, that is, making it anonymous, is called anonymization.
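As a rough illustration of the idea, the following minimal Python sketch modifies a small table before publication. The records, the field names (`name`, `age`, `zip`, `disease`) and the helper `anonymize` are all hypothetical and invented for this example; they are not taken from the paper's tables. The sketch drops the name and coarsens the age and zip code, which is only one simple way of making records less identifiable.

```python
from collections import Counter

# Hypothetical records; every name and value here is invented for illustration.
records = [
    {"name": "Asha", "age": 23, "zip": "313001", "disease": "Flu"},
    {"name": "Ravi", "age": 27, "zip": "313004", "disease": "Diabetes"},
    {"name": "Meena", "age": 34, "zip": "313022", "disease": "Flu"},
    {"name": "Karan", "age": 38, "zip": "313027", "disease": "Asthma"},
]


def anonymize(record):
    """Return a copy with the name removed and age/zip coarsened."""
    low = (record["age"] // 10) * 10          # start of the 10-year age band
    return {
        "age": f"{low}-{low + 9}",            # e.g. 23 becomes "20-29"
        "zip": record["zip"][:4] + "**",      # keep only the first four digits
        "disease": record["disease"],         # left unchanged for analysis
    }


published = [anonymize(r) for r in records]

# Count how many published rows share each (age band, zip prefix) combination.
group_sizes = Counter((r["age"], r["zip"]) for r in published)

for row in published:
    print(row)
print(dict(group_sizes))
```

In the published rows, each combination of age band and zip prefix is shared by two people, so a single row can no longer be tied to one person through these values alone. The cost is that the exact age and full zip code are lost, which is the reduction in utility described above.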
Before anonymizing data, we need to understand the different types of attributes that exist in a data set.

1. Identifier: Fields or values that uniquely identify an individual, for example name or social security number.
2. Quasi Identifier: Values that do not directly identify an individual but can lead to identity disclosure when linked with an external data set, as shown in Fig. 1.
3. Sensitive Attribute: Values that a person does not want to disclose or share, for example disease or salary.
4. Non Sensitive Attribute: Details that would not harm the individual even if leaked.

Fig. 1: Quasi identifier linkage example.

Hence, in anonymization we first remove the identifier fields from the data set so that no direct identification of an individual is possible. We then modify the quasi identifiers to prevent linkage attacks before publishing the data. Table 1 shows an example of