International Journal of Electronic and Electrical Engineering.
ISSN 0974-2174 Volume 7, Number 8 (2014), pp. 773-778
© International Research Publication House
http://www.irphouse.com
Comparative Analysis of Anonymization Techniques
Dilpreet Kaur Arora¹, Divya Bansal² and Sanjeev Sofat³
¹,²,³Computer Science Department,
PEC University of Technology, Chandigarh, India
Abstract
In recent years, privacy-preserving techniques have advanced quickly
due to the rapid increase in the storage and maintenance of personal data about
individuals. Personal data can be misused for a variety of purposes.
Maintaining privacy for high-dimensional databases has become a major
concern. To address this concern, a number of anonymization
techniques have recently been proposed to preserve the privacy of data.
In this paper, a comparative analysis of the K-Anonymity,
L-Diversity and T-Closeness anonymization techniques is presented for
high-dimensional databases, based upon the privacy metric.
Keywords: Anonymization, K-anonymity, L-diversity, T-closeness, Attributes.
Introduction
Due to the rapid growth of information technologies, companies now
collect and store huge amounts of information in their databases. Typically, such
information is stored in the form of tables, and each record corresponds to an
individual. Every record has a number of attributes, which can be divided into three
categories: 1. Explicit identifiers, which can clearly identify individuals. 2. Quasi-
identifying attributes, whose values when taken together can identify
individuals. 3. Sensitive attributes, which are considered sensitive and should not be
disclosed [4].
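The three attribute categories can be illustrated with a minimal Python sketch. The table, the column names (name, zip, age, gender, disease) and the helper functions are hypothetical and chosen only for illustration; they do not come from any dataset used in this paper. The sketch removes the explicit identifier and then counts how many records share each combination of quasi-identifying values. A combination that occurs only once still points to a single individual.

```python
from collections import Counter

# Hypothetical toy table; all names and values are made up for illustration.
RECORDS = [
    {"name": "Alice", "zip": "16001", "age": 29, "gender": "F", "disease": "Flu"},
    {"name": "Bob", "zip": "16001", "age": 29, "gender": "M", "disease": "Cancer"},
    {"name": "Carol", "zip": "16002", "age": 41, "gender": "F", "disease": "Flu"},
    {"name": "Dave", "zip": "16002", "age": 41, "gender": "F", "disease": "Asthma"},
]

# Assumed assignment of columns to the three categories.
EXPLICIT_IDENTIFIERS = ("name",)
QUASI_IDENTIFIERS = ("zip", "age", "gender")
SENSITIVE_ATTRIBUTES = ("disease",)


def drop_explicit_identifiers(records):
    """Return copies of the records without the explicit identifier columns."""
    return [
        {key: value for key, value in record.items()
         if key not in EXPLICIT_IDENTIFIERS}
        for record in records
    ]


def quasi_identifier_key(record):
    """Combine the quasi-identifying values of one record into a tuple."""
    return tuple(record[attribute] for attribute in QUASI_IDENTIFIERS)


def quasi_identifier_counts(records):
    """Count how many records share each quasi-identifier combination."""
    return Counter(quasi_identifier_key(record) for record in records)


def uniquely_identifiable(records):
    """Return the records whose quasi-identifier combination occurs only once."""
    counts = quasi_identifier_counts(records)
    return [r for r in records if counts[quasi_identifier_key(r)] == 1]


if __name__ == "__main__":
    released = drop_explicit_identifiers(RECORDS)
    for record in uniquely_identifiable(released):
        exposed = {a: record[a] for a in SENSITIVE_ATTRIBUTES}
        print(quasi_identifier_key(record), "->", exposed)
```

In this toy table, two of the four records remain unique on (zip, age, gender) after the name column is removed, so their sensitive values could still be tied to one person by anyone who knows those three values.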
A number of different anonymization techniques have been researched to protect
the identity of the respondents. Data holders often remove or encrypt the
explicit identifiers. However, de-identifying the information in this way does not provide
anonymity, as the released information also contains other data, called quasi-identifiers,
which can be used to re-identify the data respondents, thus leaking
information that is not intended to be disclosed. When releasing the information, it
is necessary to protect the sensitive information of individuals from being
disclosed. While the released table gives useful information to the researchers, it also