International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056 | p-ISSN: 2395-0072
Volume: 06 Issue: 12 | Dec 2019 | www.irjet.net
© 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1509

Study Paper On: Ontology-Based Privacy Data Chain Disclosure Discovery Method for Big Data

Sonali T. Benke 1, Devidas S. Thosar 2, Kishor N. Shedage 3

1 M.E. Student, Computer Engineering, SVIT, Nashik
2 PG Co-ordinator, Computer Engineering, SVIT, Nashik
3 HOD, Computer Engineering, SVIT, Nashik

Abstract - As a new software paradigm, cloud computing provides services dynamically according to user requirements. However, it tends to disclose personal information because of collaborative computing and transparent interactions among SaaS services. We propose a private data disclosure checking method that can be applied to the collaborative interaction process. First, we describe the privacy requirement with ontology and description logic. Second, with dynamic description logic, we validate whether SaaS services are authorized to obtain a user's privacy attributes, so that unauthorized services cannot obtain the user's private data. Third, we monitor authorized SaaS services to guarantee that the privacy requirements are met. In this way, we can prevent users' private data from being used and propagated illegally. Finally, we propose privacy disclosure checking algorithms and demonstrate their correctness and feasibility by experiments [7].

To meet users' functional requirements, cloud computing and big data have become the most commonly used computing and data resources. By analyzing, converting, extracting and refining big data, a disease can be prevented and group behavior can be predicted. However, each user's private data is also an element of big data.
Users must provide private data to service providers to meet their functional requirements. To gain economic benefits, some SaaS service providers collect and analyze users' sensitive private data without authorization; as a result, the users' private data is disclosed. In this paper, we propose a private data chain disclosure discovery method to prevent a user's sensitive privacy information from being illegally disclosed. First, we measure the similarity degree and the cost of disclosure of the private data.

Key Words: Ontology, Privacy Disclosure Detection, Privacy Data Chain, Similarity Metric

1. INTRODUCTION

Big data usually refers to data sets whose size is beyond the ability of commonly used software tools to capture, curate, manage, and process within a tolerable elapsed time; its defining characteristics are volume, variety and velocity [18]. According to statistics, an average of 2 million users per second use the Google search engine; within one second, Facebook users share information more than 4 billion times, and Twitter handles more than 340 million tweets per day. The amount of data grows exponentially every year, and three-quarters of the data are produced by people; for example, a typical American worker contributes 1.8 million MB every year. A large amount of personal privacy data can be mined for commercial purposes by agents. For example, Acxiom acquires more than 5 million personal data records of consumers all over the world through data processing, and analyzes individual behaviors and psychological tendencies with technologies such as data association and logical reasoning. In 2014, Adam Sadilek at the University of Rochester and John Krumm at the Microsoft lab predicted a person's likelihood of reaching a location in the future by analyzing information in big data, with an accuracy as high as 80%.
When a mobile application does not protect the location information in big data, a user's home address and other sensitive information can be disclosed through triangulation reasoning. Research shows that user attributes can be found by analyzing group features in a social network. For example, by analyzing a user's Twitter messages, the user's political leanings, consumption habits and other personal preferences can be found. Therefore, how to protect personal privacy information has become a hot research topic in big data.

1.1 Objective

We can check the private data disclosure chain and the key private data, which can effectively prevent service participants from maliciously disclosing users' private data, increase service trustworthiness, and provide a basis for a privacy safety-oriented trustworthiness measurement. The detailed contributions are as follows.

1. First, we obtain the relationships among privacy data by mapping them to a knowledge ontology, and build the ontology tree. We also measure the similarity degrees, which comprise property similarity, object similarity and hierarchical similarity.

2. Second, we measure the cost of disclosing the private data with sensitivity grades and a privacy disclosure vector. According to the similarity degree and the cost of disclosure, the disclosure chain and the key private data are detected during the interaction between the user and the SaaS service.
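
The two measurements listed above can be illustrated with a minimal sketch. The text here does not give the exact formulas, so everything in the sketch is an assumption made for illustration only: the ontology tree, the property and object sets, the sensitivity grades, the equal weights, the use of Jaccard overlap for property and object similarity, and a Wu-Palmer-style depth ratio for hierarchical similarity. The disclosure cost is assumed to be the sum of sensitivity grades weighted by a 0/1 privacy disclosure vector.

```python
from typing import Dict, List, Optional, Set

# Hypothetical ontology tree: child concept -> parent concept (root has None).
PARENT: Dict[str, Optional[str]] = {
    "PersonalData": None,
    "Contact": "PersonalData",
    "Location": "PersonalData",
    "Email": "Contact",
    "Phone": "Contact",
    "HomeAddress": "Location",
}

# Hypothetical property sets and object (instance) sets for two concepts.
PROPS: Dict[str, Set[str]] = {
    "Email": {"owner", "format"},
    "Phone": {"owner", "format", "country"},
}
OBJS: Dict[str, Set[str]] = {
    "Email": {"alice"},
    "Phone": {"alice", "bob"},
}

# Hypothetical sensitivity grades (higher means more sensitive).
SENSITIVITY: Dict[str, int] = {"Email": 2, "Phone": 3, "HomeAddress": 5}


def path_to_root(concept: str) -> List[str]:
    """Return the chain from a concept up to the root (concept first)."""
    path = [concept]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def hierarchical_similarity(a: str, b: str) -> float:
    """Wu-Palmer-style ratio (assumed): 2*depth(lca) / (depth(a)+depth(b))."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_a = set(path_a)
    # First concept on b's upward path that is also an ancestor of a.
    lca = next(c for c in path_b if c in ancestors_a)
    depth_lca = len(path_to_root(lca))  # root has depth 1
    return 2 * depth_lca / (len(path_a) + len(path_b))


def jaccard(x: Set[str], y: Set[str]) -> float:
    """Set overlap, used here for property and object similarity (assumed)."""
    if not x and not y:
        return 1.0
    return len(x & y) / len(x | y)


def similarity(a: str, b: str,
               props: Dict[str, Set[str]],
               objs: Dict[str, Set[str]],
               weights=(1 / 3, 1 / 3, 1 / 3)) -> float:
    """Weighted sum of property, object and hierarchical similarity."""
    w_p, w_o, w_h = weights
    return (w_p * jaccard(props[a], props[b])
            + w_o * jaccard(objs[a], objs[b])
            + w_h * hierarchical_similarity(a, b))


def disclosure_cost(sensitivity: Dict[str, int],
                    disclosed: Dict[str, int]) -> int:
    """Sum of sensitivity grades over items flagged 1 in the disclosure vector."""
    return sum(grade * disclosed.get(item, 0)
               for item, grade in sensitivity.items())


if __name__ == "__main__":
    print(similarity("Email", "Phone", PROPS, OBJS))
    print(disclosure_cost(SENSITIVITY, {"Email": 1, "HomeAddress": 1}))
```

In this sketch, a pair of private data items with a high combined similarity and a high disclosure cost would be a candidate link in a disclosure chain; any thresholds for that decision would have to come from the method itself and are not assumed here.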