Secure Authorized De-Duplication in Hybrid Cloud

1 Mr. P. Sasibhushana Rao, 2 B. Tejesvee, 3 A. Maithri Varshini, 4 A. Ganesh, 5 A. Praneetha
1 Assistant Professor, Department of Computer Science and Engineering
2,3,4,5 Undergraduate, Department of Computer Science and Engineering, Aditya Institute of Technology and Management, Srikakulam, Andhra Pradesh, India

ABSTRACT: Data de-duplication is one of the most important data compression techniques for eliminating duplicate copies of repeating data. It is widely used in cloud storage to make data management scalable, reduce the amount of storage space required, and save bandwidth. The de-duplication algorithm eliminates redundant copies by saving just one copy of the data and replacing the other copies with pointers that lead back to the retained copy. To protect the confidentiality of sensitive data while still supporting de-duplication, data is encrypted with AES (the Advanced Encryption Standard) before being outsourced. This work is an attempt to formally address the problem of authorized data de-duplication: unlike traditional de-duplication systems, the differential privileges of users are considered in the duplicate check in addition to the data itself. We implement a prototype of the proposed authorized duplicate check scheme and run test cases against it. The project aims to show that the authorized duplicate check scheme incurs minimal overhead compared to normal operations.

Index Terms- confidentiality, cloud, data security

I. INTRODUCTION

To make data management scalable in cloud computing, de-duplication has become a well-known technique and has attracted increasing attention in recent years.
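The pointer-based mechanism outlined in the abstract, keeping one physical copy and redirecting later uploads to it, can be sketched as a small content-addressed store. This is an illustrative toy, not the paper's implementation; the class and method names are hypothetical:

```python
import hashlib

class DedupStore:
    """Toy file-level de-duplication store: one physical copy per
    distinct content, with per-user pointers back to that copy."""

    def __init__(self):
        self.blobs = {}      # content hash -> the single physical copy
        self.pointers = {}   # (user, filename) -> content hash

    def upload(self, user: str, filename: str, data: bytes) -> bool:
        """Store data; return True if a duplicate was detected, in which
        case only a pointer is created and no new copy is stored."""
        digest = hashlib.sha256(data).hexdigest()
        duplicate = digest in self.blobs
        if not duplicate:
            self.blobs[digest] = data   # first (and only) physical copy
        self.pointers[(user, filename)] = digest
        return duplicate

    def download(self, user: str, filename: str) -> bytes:
        # Follow the pointer back to the retained copy.
        return self.blobs[self.pointers[(user, filename)]]

store = DedupStore()
store.upload("alice", "report.doc", b"quarterly numbers")
dup = store.upload("bob", "copy.doc", b"quarterly numbers")  # same content
assert dup and len(store.blobs) == 1   # one physical copy, two pointers
```

Note that the duplicate check here operates on plaintext; the sections below explain why conventional per-user encryption breaks this check and how it can be restored.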
Data de-duplication is a specialized data compression technique for eliminating duplicate copies of repeating data in storage. It improves storage utilization and can also be applied to network data transfers to reduce the number of bytes that must be sent. Instead of keeping multiple copies with the same content, de-duplication eliminates redundant data by keeping only one physical copy and referring all other redundant copies to it. De-duplication can take place at either the file level or the block level: file-level de-duplication eliminates duplicate copies of the same file, while block-level de-duplication eliminates duplicate blocks of data that occur in non-identical files.

Although data de-duplication brings many benefits, security and privacy concerns arise because users' sensitive data are susceptible to both insider and outsider attacks. Traditional encryption, while providing data confidentiality, is incompatible with data de-duplication. Specifically, traditional encryption requires different users to encrypt their data with their own keys, so identical data copies belonging to different users produce different ciphertexts, making de-duplication impossible.

Cloud computing, in which computation is carried out over a large communication network such as the Internet, is now widely used. It offers a low-cost storage solution for government and enterprise sectors as well as for personal data. Users can access and share resources on the cloud platform without knowing the background implementation details. The most important problems in cloud computing are the large amount of storage space required and the accompanying security issues; one critical challenge of cloud storage is managing an ever-increasing volume of data.
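The incompatibility described above, and the convergent-key approach used later in this paper to resolve it, can be demonstrated in a few lines: deriving the key from the content itself makes identical plaintexts encrypt to identical ciphertexts. The sketch below uses SHA-256 for key derivation and a simple deterministic keystream as a stand-in for AES; the cipher construction is purely illustrative, not what a production system would use:

```python
import hashlib

def convergent_key(data: bytes) -> bytes:
    # The key is derived from the content itself, so every holder of
    # the same file derives the same key.
    return hashlib.sha256(data).digest()

def encrypt(key: bytes, data: bytes) -> bytes:
    # Illustrative deterministic stream cipher (stand-in for AES):
    # keystream block = SHA-256(key || counter), XORed with the plaintext.
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[i:i + 32], block))
    return bytes(out)

decrypt = encrypt  # XOR stream cipher: the same operation both ways

file_data = b"identical file owned by two users"

# Per-user keys: identical plaintexts -> different ciphertexts,
# so the server cannot detect the duplicate.
c_alice = encrypt(hashlib.sha256(b"alice-secret").digest(), file_data)
c_bob = encrypt(hashlib.sha256(b"bob-secret").digest(), file_data)
assert c_alice != c_bob

# Convergent keys: identical plaintexts -> identical ciphertexts,
# so de-duplication works even on the encrypted copies.
k = convergent_key(file_data)
assert encrypt(k, file_data) == encrypt(convergent_key(file_data), file_data)
assert decrypt(k, encrypt(k, file_data)) == file_data
```

The trade-off, addressed by the authorized schemes surveyed below, is that deterministic content-derived encryption leaks equality of files, which is exactly what the duplicate check needs but also what an attacker can probe.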
To improve scalability and address the storage problem, data de-duplication is the most important technique and has attracted increasing attention recently. It avoids storing duplicate copies of data by keeping only a single copy. In the file-level approach, duplicate files are eliminated; in the block-level approach, duplicate blocks of data that occur in non-identical files are eliminated. De-duplication reduces storage needs by up to 90-95% for backup applications and by about 68% in standard file systems.

To upload a file to the cloud, the user first generates a convergent key, encrypts the file with it, and then uploads the file to the cloud. To prevent unauthorized access, a proof-of-ownership protocol is used: when a duplicate is found, the user must prove that they indeed own the same file. After the proof succeeds, the server provides the subsequent user with a pointer to the existing file, so the same file need not be uploaded again. To download a file, the user simply retrieves the encrypted file from the cloud and decrypts it with the convergent key.

II. LITERATURE SURVEY

P. Anderson et al. [2] proposed fast and secure laptop backups with encrypted de-duplication: an algorithm and prototype software in which data is encrypted independently by each client without invalidating de-duplication. Bellare et al. [3] showed how to protect data confidentiality by transforming a predictable message into an unpredictable one; in their system, a third party called a key server is introduced to generate the file tag for the duplicate check. Bellare et al. [4] proposed a new cryptographic primitive, Message-Locked Encryption (MLE), in which the key under which encryption and decryption are performed is itself derived from the message. MLE provides a way to achieve secure de-duplication (space-efficient secure outsourced storage), a goal currently targeted by numerous cloud-storage providers.

Mukt Shabd Journal, Volume IX, Issue V, MAY/2020, ISSN No: 2347-3150, Page No: 4317