Privacy-preserving Ranked Multi-Keyword Search Leveraging Polynomial Function in Cloud Computing

Yanzhi Ren 1, Yingying Chen 1, Jie Yang 2, Bin Xie 3
1 Department of ECE, Stevens Institute of Technology, Hoboken, NJ 07030, {yren2, yingying.chen}@stevens.edu
2 Department of CS, Florida State University, Tallahassee, FL 32306, jyang5@fsu.edu
3 InfoBeyond Technology LLC, Louisville, KY 40223, Bin.Xie@InfoBeyonds.com

Abstract—The rapid deployment of cloud computing gives users the ability to outsource their data to the public cloud for economic savings and flexibility. To protect data privacy, users have to encrypt the data before outsourcing it to the cloud, which makes data utilization, such as data retrieval, a challenging task. It is thus desirable to enable a search service over encrypted cloud data that supports effective and efficient data retrieval for a large number of data users and documents in the cloud. Existing approaches to encrypted cloud data search either focus on single keyword search or become inefficient when a large number of documents are present, and thus offer little support for efficient multi-keyword search. In this paper, we propose a light-weight search approach that supports efficient multi-keyword ranked search in cloud computing systems. Specifically, we first propose a basic scheme that uses polynomial functions to hide the encrypted keywords and search patterns for efficient multi-keyword ranked search. To enhance search privacy, we then propose a privacy-preserving scheme that utilizes the secure inner product method to protect the privacy of the searched keywords. We analyze the privacy guarantee of our proposed scheme and conduct extensive experiments on a real-world dataset. The experimental results demonstrate that our scheme enables encrypted multi-keyword ranked search with high efficiency in cloud computing.

I. INTRODUCTION

Cloud computing is becoming more and more popular and plays an increasingly important role in our daily lives. In particular, cloud users can remotely outsource their data to the cloud and enjoy on-demand services from shared computing resources [1]. Cloud computing brings users many benefits, such as relief from the storage load and flexible data access, which motivate users to move their local data to the cloud. As cloud services become prevalent, more and more sensitive information, such as personal photos, government records and financial data, is outsourced to the cloud. To protect the privacy of sensitive data in the cloud, the data has to be encrypted by the data owner before it is outsourced [2]. However, data encryption makes effective data utilization a challenging task when a large number of data files are present: users may have to download the whole data set from the cloud and decrypt it before they can conduct a keyword search over the data, which is very inefficient when the number of data files is large. Thus, effective keyword search over encrypted data is of paramount importance. In particular, there is a need for efficient ranked multi-keyword search, which accepts a set of input keywords and at the same time keeps users' searches efficient.

Nevertheless, enabling keyword search over encrypted data is not an easy task. Some techniques [3]–[5] allow the user to securely search over encrypted data with a single keyword to retrieve documents of interest. This is insufficient, as many users tend to provide multiple keywords rather than one to express their search interest. Recently, methods have been proposed for multiple keyword search in cloud computing [6], [7]. In these methods, a binary index vector needs to be built for each document, and each bit denotes whether the corresponding keyword is included in the document.
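The binary index vector described above can be illustrated with a minimal plaintext sketch. Encryption is deliberately left out, and the function names, the toy dictionary, and the use of an inner product to count matching keywords are assumptions made for illustration only; they are not taken from [6] or [7].

```python
from typing import Dict, List, Sequence, Set


def build_index_vector(dictionary: Sequence[str], keywords: Set[str]) -> List[int]:
    """Return one bit per dictionary keyword: 1 if present, 0 otherwise."""
    return [1 if kw in keywords else 0 for kw in dictionary]


def match_score(index_vec: Sequence[int], query_vec: Sequence[int]) -> int:
    """Count keywords shared by document and query (assumed scoring rule)."""
    return sum(d * q for d, q in zip(index_vec, query_vec))


def rank(dictionary: Sequence[str],
         docs: Dict[str, Set[str]],
         query_keywords: Set[str],
         top_k: int) -> List[str]:
    """Return the ids of the top_k documents, highest score first."""
    query_vec = build_index_vector(dictionary, query_keywords)
    scored = [
        (match_score(build_index_vector(dictionary, kws), query_vec), doc_id)
        for doc_id, kws in docs.items()
    ]
    # Sort by descending score; break ties by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


# Hypothetical toy data, not from the paper.
DICTIONARY = ["cloud", "privacy", "search", "ranking"]
DOCS = {
    "d1": {"cloud", "search"},
    "d2": {"privacy"},
    "d3": {"cloud", "privacy", "search"},
}

if __name__ == "__main__":
    print(rank(DICTIONARY, DOCS, {"cloud", "search"}, top_k=2))
```

In this sketch every vector has one entry per dictionary keyword, so its length is fixed by the dictionary size rather than by the number of keywords the document actually contains.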
Storing and updating the index can incur substantial overhead, especially when the number of keywords is large. Thus, the efficiency of secure multiple keyword search has considerable room for improvement in order to enhance system usability in cloud computing.

In this paper, we perform multi-keyword search over encrypted data in clouds leveraging polynomial functions. Specifically, we exploit the number of query keywords appearing in the document index to evaluate the similarity between the query and the document. Our scheme eliminates the predefined binary index vector used in the existing multiple keyword search scheme [6] and enables efficient index updates, making it scalable to a large number of search keywords. To meet the challenge of keyword search without privacy leakage, we first propose a multi-keyword search scheme that exploits polynomial functions to hide the encrypted keywords. With this approach, the search query can be described as the coefficient vector of polynomial functions, which prevents the adversary from learning the input keywords. To combat an adversary equipped with powerful computation resources, we integrate our polynomial function based approach with the existing secure inner product scheme adapted from the secure k-nearest neighbor (kNN) technique [8].

U.S. Government work not protected by U.S. copyright. Globecom 2014 - Communication and Information System Security Symposium

To validate