Journal of Convergence Information Technology
Volume 5, Number 7, September 2010

A Study on Secure Data Storage Strategy in Cloud Computing

1 Danwei Chen, 2 Yanjun He
1, First Author: College of Computer Technology, Nanjing University of Posts and Telecommunications, chendw@njupt.edu.cn
*2, Corresponding Author: College of Computer Technology, Nanjing University of Posts and Telecommunications, realmeh@gmail.com
doi: 10.4156/jcit.vol5.issue7.23

Abstract
Based on the fundamental theories of k equations in algebra, the n congruence surplus principle in elementary number theory, and Abhishek's online data storage algorithm, we propose a secure data storage strategy for cloud computing. The strategy splits data d into k sections using a data splitting algorithm, ensures high data security by simplifying the solution of the k equations, and at the same time guarantees high data reliability using the coefficients generated by the splitting algorithm.

Keywords: Cloud computing, Data partitioning, Distributed storage, Security strategy

1. Introduction
Cloud computing mainly provides three kinds of services: IaaS (Infrastructure as a Service), PaaS (Platform as a Service) and SaaS (Software as a Service) [1]. The major difference between services based on cloud computing and traditional services is that user data is stored not on a local server but in the distributed storage system of the service provider. In many cases, however, users (especially business users) have high demands regarding data security and reliability.

In traditional data protection methods, plaintext data is generally stored after encryption. In practical applications, symmetric encryption algorithms such as DES and AES are usually adopted because of their efficiency. Although data stored on the cloud server are encrypted, the encryption algorithm provides relatively low security.
Therefore, encrypted data are likely to be vulnerable to attacks [2], and business interests are compromised once the server is invaded. In this paper, we propose a secure data storage strategy that addresses the shortcomings of traditional data protection methods and improves security and reliability in cloud computing.

2. Data security storage strategies
Secure data storage in cloud computing is realized on the basis of a distributed system. After reaching the cloud, data can be stored at random on any one or more servers. Given this storage mode, each server in the distributed system can be abstracted as a storage node. Suppose there are m servers in the system, written as $S = \{s_1, s_2, \ldots, s_m\}$, and suppose the plaintext data set is d. The splitting algorithm, which is based on k equations, is applied to d to generate k (k < m) data blocks, written as

$\{d_1, d_2, \ldots, d_k\} = \mathrm{Partition}(d)$

in which Partition() is the data splitting algorithm described in detail in Section 3 of this paper. The generated data blocks are then stored separately on k servers chosen at random from the m servers, which can be expressed as

$\{d_1, d_2, \ldots, d_k\} = \mathrm{map}(S)$, where $S = \{s_1, s_2, \ldots, s_m\}$.

The data restoration process can be expressed as

$d = d_1 \cdot d_2 \cdot \ldots \cdot d_k \bmod p$

where p is a large prime number.

The core of the secure storage strategy is its data splitting algorithm, which extends the fundamental theories of k equations in algebra, the n congruence surplus principle in elementary number theory [3], Shamir's key sharing scheme [4], and Abhishek's online data storage algorithm [5,6], and through which split data storage is realized. The safety of the strategy depends mainly on two aspects. First, the data splitting algorithm is difficult to decode.
Second, because the storage servers are chosen at random after data splitting, an attacker cannot obtain the complete encrypted data by attacking one or more servers, which makes decoding even more difficult.
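
The split, map and restore steps described in this section can be illustrated with a minimal Python sketch. It is not the Partition() algorithm of Section 3: as a simplified stand-in, it draws k-1 random nonzero residues and computes the last block so that the product of all blocks equals d mod p, which matches only the restoration formula given above. The function names `partition`, `store` and `restore`, the choice p = 2^127 - 1, and the restriction 0 < d < p are all assumptions made for illustration.

```python
import secrets

def partition(d, k, p):
    """Hypothetical stand-in for Partition(): return k nonzero residues
    whose product is congruent to d modulo the prime p (needs 0 < d < p)."""
    if k < 1 or not (0 < d < p):
        raise ValueError("require k >= 1 and 0 < d < p")
    # k-1 blocks are drawn uniformly from 1..p-1.
    parts = [secrets.randbelow(p - 1) + 1 for _ in range(k - 1)]
    prod = 1
    for x in parts:
        prod = prod * x % p
    # The last block is d times the modular inverse of the others' product.
    parts.append(d * pow(prod, -1, p) % p)  # pow(x, -1, p) needs Python 3.8+
    return parts

def store(parts, m):
    """Illustrative map(S): place each block on a distinct server chosen
    at random from server indices 0..m-1; returns {server_index: block}."""
    if len(parts) > m:
        raise ValueError("need at least as many servers as blocks")
    servers = secrets.SystemRandom().sample(range(m), len(parts))
    return dict(zip(servers, parts))

def restore(blocks, p):
    """Recover d as the product of all blocks modulo p."""
    d = 1
    for x in blocks:
        d = d * x % p
    return d

if __name__ == "__main__":
    p = 2**127 - 1                         # a Mersenne prime, assumed here
    d = 123456789
    placement = store(partition(d, 5, p), m=8)
    assert restore(placement.values(), p) == d
```

In this sketch all k blocks are required to recover d, so the random placement only hides which k of the m servers hold them; the security and reliability properties claimed for the strategy depend on the actual splitting algorithm.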