Prevention of Walk Based Attack on Social Network
Graphs Using Ant Colony Optimization
Munmun Bhattacharya
Department of Information Technology
Jadavpur University
Kolkata, India
munmun@it.jusl.ac.in
Sandipan Roy
Department of Information Technology
Jadavpur University
Kolkata, India
ton_roy@yahoo.in
privacy breaches, many anonymization techniques [4] are
adopted when publishing social network data. Even after
anonymization, several attacks can still recover vital
information and identify particular users, which defeats the
purpose of anonymization, so the information must be
protected further. Walk-based attacks [2] are one
of the most prominent attacks on a social network graph.
We propose an algorithm that prevents walk-based
attacks to a large extent while minimizing data loss,
thus retaining the data mining quality of the social graph to a
considerable extent. We use a “user characteristics
metric” together with the Ant Colony Optimization technique to
anonymize the social network data while maintaining the aforesaid
criteria.
The rest of the paper is organized as follows. Section II
discusses various types of attacks and their mathematical
notation. Sections III and IV present our algorithm and the
expected results and analysis, respectively. Finally,
Section V concludes the paper.
Abstract: Social networks are one of the most impactful innovations
of the last decade. They provide a way to connect millions of people
around the world. Social networking sites sometimes sell their
data to third-party organizations for analysis and data mining, as
a result of which the privacy of the users may be
compromised. Even after naive anonymization of the social
graph, several attacks can identify the victim and
hence extract the victim's private information. Walk-based
attacks are one of the most prominent active attacks on a social
network graph: the attacker creates a set of malicious
nodes before naive anonymization and attaches them to a target
node, creating an identifiable subgraph. The attacker then tries to
identify this subgraph in the naively anonymized graph; if successful,
the identity of the victim is compromised. We propose
an algorithm that prevents walk-based attacks to a large extent
while minimizing data loss, thus retaining the data mining
quality of the social graph to a considerable extent. We use a
“user characteristics metric” along with Ant Colony
Optimization to anonymize the data while maintaining the aforesaid
criteria.
Keywords— Social Graph; Anonymization; Walk Based Attack;
Ant Colony Optimization;
II. STRUCTURAL ATTACK
A. Related works
The first work to address the privacy problem for
social graph data was that of Backstrom et al. [2]. In their
work, the authors consider the scenario where a social graph is
published for data mining purposes. In the social graph model, a
node represents an individual and a link represents a
particular type of (sensitive) relationship between two
individuals. They present several attacks on social graphs.
Specifically, the authors emphasize the difference between
active attacks, where the adversary is able to add nodes
and edges before the publication of a graph, and passive
attacks, where the adversary attacks only an already published,
static graph. To compromise the victim's privacy, the
authors propose the walk-based and cut-based attacks.
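To make the idea of an active attack concrete, the following is a minimal Python sketch of a simplified walk-style attack on a toy graph. It is an illustration under stated assumptions, not the construction of [2]: the user names, the three attacker nodes `s0`, `s1`, `s2`, the helper functions `add_edge`, `naive_anonymize`, and `find_walks`, and the choice of a plain path as the planted subgraph are all hypothetical. The attacker plants a small path before publication, attaches its first node to the victim, and later searches the naively anonymized graph for a node sequence with the same degrees and internal edges.

```python
import random

def add_edge(g, u, v):
    """Add an undirected edge to an adjacency-set graph."""
    g.setdefault(u, set()).add(v)
    g.setdefault(v, set()).add(u)

def naive_anonymize(g, seed=0):
    """Replace every node name with a random integer ID; edges are unchanged.
    The mapping is returned only so the demo can check the result."""
    rng = random.Random(seed)
    ids = list(range(len(g)))
    rng.shuffle(ids)
    mapping = dict(zip(sorted(g), ids))
    anon = {mapping[u]: {mapping[v] for v in nbrs} for u, nbrs in g.items()}
    return anon, mapping

def find_walks(g, degrees, internal):
    """Return every node sequence x_0..x_{k-1} in g such that x_i has degree
    degrees[i], consecutive nodes are adjacent, and the edges among the
    sequence are exactly the index pairs (j, i), j < i, listed in `internal`."""
    k = len(degrees)
    found = []

    def extend(path):
        i = len(path)
        if i == k:
            found.append(path)
            return
        candidates = g if i == 0 else g[path[-1]]
        for x in candidates:
            if x in path or len(g[x]) != degrees[i]:
                continue
            # Edges from x back to earlier path nodes must match the planted pattern.
            if all(((j, i) in internal) == (path[j] in g[x]) for j in range(i)):
                extend(path + [x])

    extend([])
    return found

# Toy social graph (hypothetical users).
g = {}
for u, v in [("alice", "bob"), ("bob", "carol"), ("carol", "dave"),
             ("dave", "erin"), ("alice", "carol")]:
    add_edge(g, u, v)

# Active attack: before publication the attacker adds the path s0-s1-s2
# and links s0 to the victim "carol".
for u, v in [("s0", "s1"), ("s1", "s2"), ("s0", "carol")]:
    add_edge(g, u, v)

anon, mapping = naive_anonymize(g, seed=7)

# The attacker knows only the planted degrees and internal edges.
walks = find_walks(anon, degrees=[2, 2, 1], internal={(0, 1), (1, 2)})
sybils = walks[0]
# The victim is the neighbour of x_0 that is not part of the planted path.
victim_id = (anon[sybils[0]] - set(sybils)).pop()
print("anonymized ID of the victim:", victim_id)
```

In this toy graph the planted degree sequence happens to be unique, so the attacker recovers the victim's anonymized ID without access to `mapping`. In a large graph a path this small would generally not be unique, which is why a real attack needs a more distinctive planted subgraph.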
Mathematically walk based attack is
I. INTRODUCTION
The last decade saw the rise of a new paradigm in the field of
intercommunication: social networking. Almost all the world is
now under the wings of social networks. A social network is a
website on the Internet that brings people together in a central
location to talk, share ideas and interests, or make new friends.
This type of collaboration and sharing of data is often referred
to as social media. It contains huge volumes of data, some of
which is sensitive and private. Unfortunately,
one of the most prominent ways for social networking sites
to earn revenue is to sell the data of their users to third parties such as
advertising partners (for targeted advertisements),
application developers, and academic researchers, who
mine and analyze the data to make decisions that are directly
or indirectly related to increasing their profit. Privacy is a
major concern when publishing these data for analysis, as an
adversary can re-identify a vertex (i.e., an individual), an edge,
or the labels (or attributes) of a vertex using the published data
and some background knowledge. In order to stop these
978-1-4799-6908-1/15/$31.00 ©2015 IEEE