Prevention of Walk Based Attack on Social Network Graphs Using Ant Colony Optimization

Munmun Bhattacharya
Department of Information Technology
Jadavpur University
Kolkata, India
munmun@it.jusl.ac.in

Sandipan Roy
Department of Information Technology
Jadavpur University
Kolkata, India
ton_roy@yahoo.in

978-1-4799-6908-1/15/$31.00 ©2015 IEEE

Abstract: Social networks are among the most impactful innovations of the last decade. They provide a way to connect millions of people around the world. Social networking sites sometimes sell their data to third-party organizations for analysis and data mining, which creates a chance that the privacy of the users is compromised. Even after naive anonymization of the social graph, several attacks can identify a victim, after which the victim's private information can be extracted. Walk-based attacks are among the most prominent active attacks on a social networking graph. In such an attack, the attacker creates a set of malicious nodes before naive anonymization and attaches them to a target node, creating an identifiable subgraph. The attacker then tries to identify that subgraph in the naively anonymized graph; if it succeeds, the identity of the victim is compromised. We propose an algorithm that prevents walk-based attacks to a large extent while minimizing data loss, thus retaining the data mining quality of the social graph to a considerable extent. We use a “user characteristics metric” along with Ant Colony Optimization to anonymize the data while maintaining the aforesaid criteria.

Keywords: Social Graph; Anonymization; Walk Based Attack; Ant Colony Optimization

I. INTRODUCTION

The last decade saw the rise of a new paradigm in the field of intercommunication: social networking. Almost all the world is now under the wings of social networks. A social network is a website on the Internet that brings people together in a central location to talk, share ideas and interests, or make new friends. This type of collaboration and sharing of data is often referred to as social media. It contains huge volumes of data. Sometimes the data is sensitive and private. Unfortunately, one of the most prominent ways for social networking sites to earn revenue is to sell the data of their users to third parties such as advertising partners (to get targeted advertisements), application developers and academic researchers, who mine and analyze the data to make decisions that are directly or indirectly related to increasing their profit. Privacy is a major concern when publishing these data for analysis, as an adversary can re-identify a vertex (i.e. an individual), an edge, or the labels (or attributes) of a vertex using the published data and some background knowledge.

In order to stop these privacy breaches, many anonymization techniques [4] are adopted when publishing social network data. Even after anonymization, several attacks can still obtain vital information and identify particular users, which defeats the purpose of anonymization and security, so the information must be protected. Walk-based attacks [2] are among the most prominent attacks on a social networking graph. We propose an algorithm that prevents walk-based attacks to a large extent while minimizing data loss, thus retaining the data mining quality of the social graph to a considerable extent. We use a “user characteristics metric” along with the Ant Colony Optimization technique to anonymize the social network data while maintaining the aforesaid criteria.

The rest of the paper is organized as follows. In Section II we discuss various types of attacks and their mathematical notation. In Sections III and IV we discuss our algorithm and the expected results and analysis, respectively. Finally, we conclude in Section V.

II. STRUCTURAL ATTACK

A. Related works

The first work to address the privacy problem for social graph data was initiated by Backstrom et al. [2]. In their work, the authors consider the scenario where a social graph is published for data mining purposes. In the social graph model, a node represents an individual and a link represents a particular type of (sensitive) relationship between two individuals. They present several attacks on social graphs. Specifically, the authors emphasize the difference between active attacks, where the adversary may be able to add nodes and edges before the publication of a graph, and passive attacks, where the adversary attacks only an already published, static graph. To compromise the victim’s privacy, the authors propose the walk-based and cut-based attacks. Mathematically, the walk-based attack is