INTERNATIONAL JOURNAL OF SCIENTIFIC & TECHNOLOGY RESEARCH VOLUME 9, ISSUE 01, JANUARY 2020 ISSN 2277-8616 1089 IJSTR©2020 www.ijstr.org

Detection Of Data Leakage In Cloud Storages

Naresh Vurukonda, Allu Venkata Dattatreya Reddy, Gutta Chiranjeevi, Kancharla Raviteja

Abstract: Leakage of sensitive data may lead to the loss of confidentiality and integrity. Some of the data may be leaked and found on the web or in the hands of untrusted users. Distributors have to address these situations in order to maintain data confidentiality and ensure safe data transactions. Many small businesses face data leaks via the internet or other means. We propose an alternative methodology, intended for real-world use, that differs from traditional methods. Traditional methods include “watermarking”, and in some cases “realistic but fake” data records can also be injected to further improve the chances of detecting leakage and identifying the guilty party. However, this does not work if the guilty agent knows which objects are fake, so another method for identifying guilty agents is needed. Many methods exist, but each has been overridden by more complex methodologies and by various combinations of algorithms; these complex methods offer better security than the older ones. We identify agents using parameters such as how much time an agent spends on the data and how many times the agent opened a file. From these parameters we compute a probability; if the probability exceeds the threshold value, we conclude that the agent has been compromised. In this model we use knowledge of the previous methods to predict the guilty agents and to overcome the shortcomings of those methods.

Keywords: watermarking, guilt agent, fake object, probability, automation

——————————

1. INTRODUCTION
In a rapidly growing digital world, sensitive data is transferred between all parts of the globe.
With increasing concerns over data transmission, many methods are being implemented to prevent data leakage. Early work consists of traditional methods such as watermarking and ciphertext conversion. These methods have been implemented for text files, but many multimedia formats and newer video and audio formats are not compatible with them. Many companies and business agencies adopted watermarking and used it to mark files transferred over networks. Watermarking appeared to be a prominent solution for the business model for a time, but it has since been overcome: watermarks have been destroyed or removed using advanced cryptography tools. Later, many other methods came into existence, and some of them survived somewhat longer, such as fake object allocation, the agent guilt model, optimization methods, hashing, and salted hashing. From the literature on these algorithms, we understand that vulnerabilities can be found in all of these methodologies. We estimate the probability in a different way, in which the judgment of an agent depends on its previous activity. All agents are judged by their previous history of leakage and by the demand from the agent. The probability is calculated from this data using Naïve Bayes or a related probabilistic algorithm. The value is compared to a threshold level of risk, and the AI system decides whether to allow the request or not.

2. RELATED WORK
The main objective of this project is to find the guilty agents, that is, the agents who leak data to third-party users for financial gain or for some other purpose. The fake object and watermarking methods used earlier for finding the guilty agent are very old methods. We propose a new method for finding guilty agents based on the number of times an agent accesses the data and the duration for which the agent accesses it.
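The scoring idea described above (a probability derived from access count and access duration, compared against a risk threshold) can be illustrated with a small sketch. This is not the authors' implementation, which used R for the analysis; it is a minimal Gaussian Naïve Bayes written in Python for illustration. The `HISTORY` records, the Gaussian likelihood, the function names, and the default threshold of 0.5 are all assumptions made for the example.

```python
import math
from statistics import mean, pstdev

# Hypothetical history of past requests: (times_opened, minutes_spent, leaked).
# These numbers are invented for illustration only.
HISTORY = [
    (2, 5.0, False), (3, 8.0, False), (1, 4.0, False), (4, 10.0, False),
    (12, 45.0, True), (15, 60.0, True), (10, 38.0, True),
]


def gaussian_pdf(x, mu, sigma):
    """Normal density, used here as the per-feature likelihood (an assumption)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def fit(history):
    """Estimate the class prior and the per-feature mean/std for each class."""
    model = {}
    for label in (True, False):
        rows = [r for r in history if r[2] == label]
        stats = []
        for i in (0, 1):
            values = [r[i] for r in rows]
            # Floor the std so a constant feature cannot cause division by zero.
            stats.append((mean(values), max(pstdev(values), 1e-3)))
        model[label] = (len(rows) / len(history), stats)
    return model


def guilt_probability(model, times_opened, minutes_spent):
    """Posterior P(leaked | features) under the naive independence assumption."""
    scores = {}
    for label, (prior, stats) in model.items():
        score = prior
        for x, (mu, sigma) in zip((times_opened, minutes_spent), stats):
            score *= gaussian_pdf(x, mu, sigma)
        scores[label] = score
    total = scores[True] + scores[False]
    # If both likelihoods underflow to zero, report no evidence of guilt.
    return scores[True] / total if total > 0 else 0.0


def is_guilty(model, times_opened, minutes_spent, threshold=0.5):
    """Flag the agent when the posterior exceeds the chosen risk threshold."""
    return guilt_probability(model, times_opened, minutes_spent) > threshold
```

With this toy history, an agent who opened a file 14 times for 55 minutes scores far above the threshold, while one who opened it twice for 6 minutes scores far below it. In practice the threshold would be chosen per data item, as the text describes.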
For this approach we have designed a flow that finds the guilty agent more simply and quickly than the other approaches. To obtain details such as how long an agent uses and accesses the files, there are different approaches, and we could also design our own algorithms, but that takes a lot of time and would not be accurate. Some online sites serve the same purpose accurately. We chose the SalesHandy website, which is meant for tracking files and emails. The data to be shared with the agents is uploaded and a link to it is generated. The link is shared with the agents by any existing method the distributor prefers. The distributor can then monitor the agents' details in their account on the website. The data on the site has to be extracted for further processing; for that we used automation and created a bot that extracts the data from the website without any human work. After the data is extracted, the next step is to calculate the probability, using machine learning, from the time for which the agent accessed the data and the average time for which the data is accessed; for the analysis we used R programming. We decide a threshold value for each particular data item and use it when calculating the probability that the agent is guilty. If the probability that the agent is guilty is high, we mark that agent as guilty and no longer forward data to that agent.

3. WATERMARKING:
The watermarking methodology is implemented on text files for data leakage detection. The watermark is applied to various parts of the file, which is then sent to the requesting agent.
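Per-agent marking of a text file can be sketched as follows. The paper does not specify a watermarking scheme, so everything here is an illustrative assumption: the mark is the agent identifier encoded as invisible zero-width characters, appended to every non-empty line so that it appears in several parts of the file, and the function names `embed` and `extract` are hypothetical.

```python
ZERO, ONE = "\u200b", "\u200c"  # zero-width space / zero-width non-joiner


def encode_mark(agent_id):
    """Turn the agent identifier into a string of invisible characters."""
    bits = "".join(f"{byte:08b}" for byte in agent_id.encode("utf-8"))
    return "".join(ONE if b == "1" else ZERO for b in bits)


def embed(text, agent_id):
    """Append the invisible mark to every non-empty line of the text."""
    mark = encode_mark(agent_id)
    return "\n".join(line + mark if line.strip() else line
                     for line in text.split("\n"))


def extract(text):
    """Return the set of agent identifiers recovered from marked lines."""
    found = set()
    for line in text.split("\n"):
        bits = "".join("1" if ch == ONE else "0"
                       for ch in line if ch in (ZERO, ONE))
        # Only decode complete bytes; partial marks are ignored.
        if bits and len(bits) % 8 == 0:
            data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
            found.add(data.decode("utf-8", errors="replace"))
    return found
```

A mark of this kind is trivially removed by stripping the zero-width characters, which illustrates the weakness of watermarking noted in the introduction.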
Finding the unique watermark in the possession of an unauthorized person or firm can be considered as data
———————————————
Naresh Vurukonda, Assistant Professor, Koneru Lakshmaiah Educational Institute, Guntur, India, 9908109980, nareshvurukonda@kluniversity.in
Allu Venkata Dattatreya Reddy, student, Koneru Lakshmaiah Educational Institute, Guntur, India, 9533650272, dattatreya.allu.4370@gmail.com
Gutta Chiranjeevi, student, Koneru Lakshmaiah Educational Institute, Guntur, India, 9491559182, gchitanjeevi1999@gmail.com
Kancharla Raviteja, student, Koneru Lakshmaiah Educational Institute, Guntur, India, 9701329957, kancharlaraviteja1999@gmail.com