International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 12 | Dec 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1509
Study Paper On: Ontology-Based Privacy Data Chain Disclosure
Discovery Method for Big Data
Sonali T. Benke¹, Devidas S. Thosar², Kishor N. Shedage³
¹M.E. Student, Computer Engineering, SVIT, Nashik
²PG Co-ordinator, Computer Engineering, SVIT, Nashik
³HOD, Computer Engineering, SVIT, Nashik
---------------------------------------------------------------------***----------------------------------------------------------------------
Abstract - As a new software paradigm, cloud computing
provides services dynamically according to user requirements.
However, it tends to disclose personal information due to
collaborative computing and transparent interactions among
SaaS services. We propose a private data disclosure checking
method that can be applied to the collaboration interaction
process. First, we describe the privacy requirement with
ontology and description logic. Second, with dynamic
description logic, we validate whether SaaS services are
authorized to obtain a user’s privacy attributes, to prevent
unauthorized services from obtaining their private data.
Third, we monitor authorized SaaS services to guarantee
privacy requirements. Therefore, we can prevent users’ private
data from being used and propagated illegally. Finally, we
propose privacy disclosure checking algorithms and
demonstrate their correctness and feasibility through
experiments [7].
To meet users' functional requirements, cloud computing and
big data have become the most commonly used computing and
data resources. By analyzing, converting, extracting and
refining big data, diseases can be prevented and group
behavior can be predicted. However, each user's private
data is also an element of big data. Users must provide private
data to service providers to meet their functional
requirements. To gain economic benefits, some SaaS service
providers collect and analyze users' sensitive private data
without authorization; as a result, the users' private data
is disclosed. In this paper, we propose a private data chain
disclosure discovery method to prevent a user's sensitive
privacy information from being illegally disclosed. First, we
measure the similarity degree and the cost of disclosure of the
private data.
Key Words: Ontology, Privacy Disclosure Detection, Privacy
Data Chain, Similarity Metric etc.
1. INTRODUCTION
Big data usually refers to data sets whose size is beyond the
ability of commonly used software tools to capture, curate,
manage, and process within a tolerable elapsed time; its
characteristics are volume, variety and velocity [18].
According to statistics, an average of 2 million users per
second use the Google search engine; within one second,
Facebook users share information more than 4 billion times,
and Twitter handles more than 340 million tweets per day.
The amount of data grows exponentially every year, and
three-quarters of data are produced by people; for example,
a typical American worker contributes 1.8 million MB every
year. A large amount of personal privacy data can be mined
for commercial purposes by agents. For example, Acxiom
queries more than 5 million pieces of consumers' personal
data from all over the world through data processing, and
analyzes individual behaviors and psychological tendencies
with technologies such as data association and logical
reasoning.
In 2014, Adam Sadilek at the University of Rochester and John
Krumm at the Microsoft lab predicted a person's likelihood
of reaching a location in the future by analyzing the
information in big data, with an accuracy as high as 80%. If
a mobile application does not protect the location data
within big data, a user's home address and other sensitive
information can be disclosed through the triangulation
reasoning method. Research shows that user attributes can
be found by analyzing group features in a social network. For
example, by analyzing a user's Twitter messages, the user's
political leanings, consumption habits and other personal
preferences can be found. Therefore, how to protect
personal privacy information has become a hot research
topic with respect to big data.
1.1 Objective
1. We can check the private data disclosure chain and the key
private data, which can effectively prevent service
participants from maliciously disclosing users' private data,
increase service trustworthiness, and provide a basis for a
privacy safety-oriented trustworthiness measurement.
The detailed contributions are as follows. Firstly, we obtain
the relationships among privacy data by mapping them to a
knowledge ontology, and build the ontology tree. We also
measure the similarity degrees, comprising property
similarity, object similarity and hierarchical similarity.
2. Secondly, we measure the cost of disclosure of the
private data with sensitivity grades and a privacy disclosure
vector. According to the similarity degree and the cost of
disclosure, the disclosure chain and the key private data are
detected during the interaction between the user and the SaaS
service.
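The two objectives above can be illustrated with a minimal sketch. It is not the method of this paper, whose formulas are not given in this section. The ontology tree, the attribute names, the sensitivity grades and the threshold are all hypothetical; hierarchical similarity is approximated with a Wu-Palmer-style depth ratio, the disclosure vector is modelled as the set of disclosed attributes, and the cost is assumed to be the sum of their sensitivity grades.

```python
# Hypothetical ontology tree stored as child -> parent (None marks the root).
PARENT = {
    "PersonalData": None,
    "Contact": "PersonalData",
    "Location": "PersonalData",
    "Phone": "Contact",
    "Email": "Contact",
    "HomeAddress": "Location",
    "ZipCode": "Location",
}

# Hypothetical sensitivity grades for the leaf attributes (higher = more sensitive).
SENSITIVITY = {"Phone": 2, "Email": 1, "HomeAddress": 3, "ZipCode": 1}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth in the tree, counting the root as depth 1."""
    return len(path_to_root(node))


def hierarchical_similarity(a, b):
    """Wu-Palmer-style similarity (an assumption, not the paper's formula):
    2 * depth(lowest common ancestor) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    lca = next(n for n in path_to_root(b) if n in ancestors_a)
    return 2 * depth(lca) / (depth(a) + depth(b))


def disclosure_cost(disclosed):
    """Assumed cost model: sum of sensitivity grades of disclosed attributes."""
    return sum(SENSITIVITY[attr] for attr in disclosed)


def inferable(disclosed, threshold=0.6):
    """Attributes not disclosed directly but similar enough to a disclosed
    one that they might be exposed through a disclosure chain."""
    return sorted(
        target
        for target in SENSITIVITY
        if target not in disclosed
        and any(hierarchical_similarity(target, d) >= threshold for d in disclosed)
    )


if __name__ == "__main__":
    disclosed = {"ZipCode"}
    print(disclosure_cost(disclosed))   # direct cost of what was handed over
    print(inferable(disclosed))         # candidates for chained disclosure
```

In this toy setting, a service that receives only the zip code has a low direct cost, but the home address is flagged because it shares the "Location" parent with the disclosed attribute. Property similarity and object similarity, which the paper also names, are not modelled here.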