International Journal of Computer Applications (0975 – 8887)
Volume 87 – No.15, February 2014

Privacy Preserving Techniques in Social Networks Data Publishing-A Review

Amardeep Singh
Ph.D. Scholar, Department of Computer Science, PEC University of Technology, Chandigarh, India

Divya Bansal, Ph.D
Associate Professor, Department of Computer Science, PEC University of Technology, Chandigarh, India

Sanjeev Sofat, Ph.D
Professor, Department of Computer Science, PEC University of Technology, Chandigarh, India

ABSTRACT
The growth of online social networks and the publication of social network data have created a risk that confidential information about individuals will be leaked. Service providers therefore need to preserve privacy before publishing such network data. Privacy in online social network data has been a major concern in recent years, yet research in this field is still at an early stage. Several published academic studies have proposed solutions for protecting the privacy of tabular micro-data, but those techniques cannot be applied straightforwardly to social network data because a social network is a complex graph structure of vertices and edges. Techniques such as k-anonymity, its variants, and l-diversity have been applied to social network data. An integrated technique combining k-anonymity and l-diversity has also been developed to better protect the privacy of social network data.

General Terms
Social Network, Anonymization, Privacy, Attacks, Attributes.

Keywords
Privacy models, K-anonymity, L-diversity, t-closeness.

1. INTRODUCTION
Owing to the growing popularity of online social networks on the Web [1], a large number of people subscribe to social networks or social media. This has generated a large amount of user data that is gathered and maintained by social network service providers. The data generated by social network services is termed social network data, and in certain situations it needs to be published for others.
One such situation arises when a specific analysis of the user data is required; another arises when the data owner has to share the data with third parties such as advertising partners, which is part of the policies generally accepted by subscribers. The data contains valuable information about users that helps third parties target advertisements more effectively. Social network analysis is used in modern sociology, geography, economics, and information sciences [2]. Researchers in various fields use this data for different purposes; for example, researchers in government institutions require social network data for information and security purposes [3]. Data therefore needs to be shared or published in all of the situations mentioned above. The data owner can publish it for others to analyze, but doing so may create serious privacy threats.

To meet the demand for network data, online social media operators have been sharing the data they gather and maintain with external third parties such as advertisers, application developers, and academic researchers. Facebook, for example, has thousands of third-party applications, and this number has increased exponentially [4]. Social network data contains sensitive and confidential information about users [5-7], so sharing it in raw form may breach the privacy of individuals. Individual privacy is defined as “the right of the individual to decide what information about himself should be communicated to others and under what circumstances” [8]. A privacy breach occurs when private and confidential information about a user is disclosed to an adversary. Preserving the privacy of individuals while publishing collected user data is therefore an important research area, and various researchers have worked in this direction.
This paper is structured as follows: Section 2 describes the categories of privacy breach; Section 3 outlines the challenges in preserving privacy in social network data; Section 4 presents existing techniques for preserving privacy in tabular micro-data; Section 5 covers techniques for preserving privacy in social networks; Section 6 gives research directions for new researchers; and Section 7 concludes the review.

2. CATEGORIES OF PRIVACY BREACH
Privacy breaches in social networks can be categorized into three types [9-10]:

i. Identity disclosure - Identity disclosure occurs when the individual behind a record is exposed. This type of breach reveals information about a user and the relationships he/she shares with other individuals in the network.
ii. Sensitive link disclosure - Sensitive link disclosure occurs when the association between two individuals is revealed. Users generate this type of information through their social activities when they use social media services.
iii. Sensitive attribute disclosure - Sensitive attribute disclosure takes place when an attacker obtains the value of a sensitive and confidential user attribute. Sensitive attributes may be associated with entities as well as with link relationships.

All of these privacy breaches pose severe threats such as stalking, blackmail, and robbery, and they violate users’ expectation that the service provider will keep their data private. They can also damage an individual’s image and reputation. There are many examples of accidental disclosure of users’ private information, such as the AOL search data case [11] and the attacks on Netflix data [12], which have made organizations conservative about releasing network data. As