IJSTE - International Journal of Science Technology & Engineering | Volume 2 | Issue 10 | April 2016
ISSN (online): 2349-784X
All rights reserved by www.ijste.org

Secrecy Preserving Discovery of Subtle Statistic Contents

A. Francis Thivya
PG Scholar
Department of Computer Science & Engineering
Christian College of Engineering and Technology, Oddanchatram, Tamilnadu-624619, India

P. Tharcis
Assistant Professor
Department of Computer Science & Engineering
Christian College of Engineering and Technology, Oddanchatram, Tamilnadu-624619, India

Abstract
A data distributor or software company gives sensitive data to a set of confidential data owners. Sometimes the data are leaked and found in an unauthorized place. Data leakage happens every day when sensitive information is transferred to third-party agents. Data leak detection is therefore an important part of managing and protecting confidential information. The exposure of sensitive information in storage and transmission poses a serious threat to organizational and individual security. Data leak detection aims at scanning content for exposed sensitive information. Because of the large capacity and data volume involved, the underlying algorithm needs to be scalable for timely detection. This work proposes host-based data leak detection (DLD). The data owner computes a special set of digests, or fingerprints, from the sensitive information and then reveals only a small amount of digest data to the DLD provider. This work implements and evaluates a new privacy-preserving data-leak detection system that enables the data owner to safely deploy detection locally, or to delegate the traffic-inspection task to DLD providers, without revealing the sensitive data. It works especially well when consecutive data blocks are leaked. This host-based DLD technique may improve the accuracy, privacy, concision and efficiency of fuzzy fingerprint data leak detection.
Keywords: Data Leak Detection Provider, Fuzzy Fingerprint, Host-Based Data Leak Detection, Semi-Honest Adversary

I. INTRODUCTION
Information security has always played an important role as technology has advanced, and it has become one of the hottest topics of the last few decades. Information forensics and security is the investigation and analysis technique used to gather and preserve information from a particular computing device. Information security is the set of business processes that protect information assets, regardless of how the information is formatted or produced. It is essential for maintaining the availability, confidentiality and integrity of information technology systems and business data. Most large enterprises employ a dedicated security group to implement and manage the organization's information security program.

Detecting and preventing data leaks involves a set of solutions including data confinement [6], stealthy malware detection, policy enforcement and data leak detection. Typical approaches to preventing data leaks fall into two categories [10]: i) host-based solutions and ii) network-based solutions. Network-based data leak detection focuses on analyzing unencrypted outbound network traffic through i) deep packet inspection and ii) information-theoretic analysis. Deep packet inspection examines every packet for occurrences of sensitive data defined in a database [1]. Network-based data leak detection complements the host-based approach. Host-based data leak detection typically involves i) encrypting the data, ii) detecting malware by scanning the host with antivirus software, and iii) enforcing policies that limit the transfer of sensitive data [10].
In the host-based data leak detection approach, the data owner computes a specialized set of digests from the sensitive data and exposes only a small amount of them to Data Leak Detection (DLD) providers. The DLD provider computes fingerprints from network traffic and identifies potential leaks. The set of potential leaks comprises both real leaks and noise [1], [15]. The data owner then post-processes the potential leaks reported by the DLD provider to determine whether any of them is a real leak. The data owner uses the Rabin fingerprint algorithm and a sliding window to generate the one-way computation through a fast polynomial modulus operation [10], [13]. To detect data leaks while preserving privacy, the data owner generates a set of special digests called fuzzy fingerprints. Fuzzy fingerprints are used to hide the sensitive data in the crowd of network traffic [10]. The DLD provider inspects network traffic, but it may also attempt to learn information about the sensitive data. The DLD provider therefore detects leaks by range-based comparison rather than by exact match. The DLD provider is modeled as a semi-honest adversary [1].

The remainder of this paper is organized as follows. Section II describes the Related Works. Section III describes the Proposed Work. Section IV describes the Experimental Evaluation and Results. Section V summarizes the Conclusion and Future Enhancement.
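
As an illustration of the fingerprinting workflow outlined in this introduction, the following Python sketch computes Rabin-style fingerprints over a sliding window, blurs them into fuzzy fingerprints, and matches traffic by range rather than by exact value. It is a minimal sketch under stated assumptions, not the system evaluated in this paper: the modulus polynomial, the 16-byte window, the 16 fuzzy bits, the scheme of randomizing the low-order bits, and all function names are hypothetical choices made only for this example.

```python
import random

# Assumed parameters (not taken from the paper).
DEGREE = 64
POLY = (1 << DEGREE) | 0x1B   # x^64 + x^4 + x^3 + x + 1, illustrative modulus
WINDOW = 16                   # sliding-window (shingle) size in bytes
FUZZY_BITS = 16               # number of low-order bits that get randomized


def rabin_fingerprint(data: bytes, poly: int = POLY, degree: int = DEGREE) -> int:
    """Treat data as a polynomial over GF(2) and reduce it modulo poly."""
    fp = 0
    for byte in data:
        for shift in range(7, -1, -1):
            fp = (fp << 1) | ((byte >> shift) & 1)
            if fp >> degree:          # degree overflowed: subtract (XOR) the modulus
                fp ^= poly
    return fp


def shingle_fingerprints(data: bytes, window: int = WINDOW) -> list:
    """Fingerprint every window-sized shingle of data, in order of offset."""
    return [rabin_fingerprint(data[i:i + window])
            for i in range(len(data) - window + 1)]


def fuzzify(fp: int, rng: random.Random, fuzzy_bits: int = FUZZY_BITS) -> int:
    """Data-owner side: replace the low fuzzy_bits bits with random bits."""
    mask = (1 << fuzzy_bits) - 1
    return (fp & ~mask) | rng.getrandbits(fuzzy_bits)


def in_fuzzy_range(candidate: int, fuzzy_fp: int, fuzzy_bits: int = FUZZY_BITS) -> bool:
    """Range-based comparison: only the bits above fuzzy_bits must agree."""
    return (candidate >> fuzzy_bits) == (fuzzy_fp >> fuzzy_bits)


def potential_leaks(traffic: bytes, fuzzy_fps, window: int = WINDOW,
                    fuzzy_bits: int = FUZZY_BITS) -> list:
    """DLD-provider side: offsets of traffic shingles that fall in any fuzzy range."""
    prefixes = {f >> fuzzy_bits for f in fuzzy_fps}
    return [offset
            for offset, fp in enumerate(shingle_fingerprints(traffic, window))
            if (fp >> fuzzy_bits) in prefixes]


# Made-up example data, used only to exercise the sketch.
rng = random.Random(7)
sensitive = b"quarterly revenue forecast: confidential draft"
fuzzy_set = [fuzzify(fp, rng) for fp in shingle_fingerprints(sensitive)]
traffic = b"POST /upload body=" + sensitive + b"&end=1"
hits = potential_leaks(traffic, fuzzy_set)
```

In this sketch the provider only ever sees `fuzzy_set`, so each reported offset in `hits` is a potential leak that the data owner would still need to check against the exact fingerprints during post-processing.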