Secure Top-k Query Processing on Encrypted Databases

Xianrui Meng, Haohan Zhu, and George Kollios
Department of Computer Science, Boston University

Abstract

Privacy concerns in outsourced cloud databases have become increasingly important, and many efficient and scalable query processing methods over encrypted data have been proposed. However, there is very limited work on how to securely process top-k ranking queries over encrypted databases in the cloud. In this paper, we focus exactly on this problem: secure and efficient processing of top-k queries over outsourced databases. In particular, we propose the first efficient and provably secure top-k query processing construction that is adaptively IND-CQA secure. We develop an encrypted data structure called EHL and describe several secure sub-protocols under our security model to answer top-k queries. Furthermore, we optimize our query algorithms for both space and time efficiency. Finally, we empirically analyze our protocol using real-world datasets and demonstrate that our construction is efficient and practical.

1 Introduction

As remote storage and cloud computing services such as Amazon's EC2, Google AppEngine, and Microsoft's Azure emerge, many enterprises, organizations, and end users may outsource their data to these cloud service providers for reliable maintenance, lower cost, and better performance. In fact, a number of cloud database systems have been developed recently that offer high availability and flexibility at relatively low cost. Despite these benefits, a number of reasons still make many users refrain from using these services, especially users with sensitive and valuable data. Undoubtedly, the main issue is security and privacy concerns [3].
Indeed, data owners and clients may not fully trust a public cloud, since hackers or cloud administrators with root privileges can access all data for any purpose. Sometimes the cloud provider may sell its business to an untrusted company, which will then have full access to the data. One approach to address these issues is to encrypt the data before outsourcing it to the cloud. For example, electronic health records (EHRs) should be encrypted before outsourcing in compliance with regulations like HIPAA¹. Encrypting data can enhance security in the Database-as-a-Service environment [20]. However, it also introduces significant difficulties in querying and computing over the data.

In recent years, many approaches have been proposed for computing over encrypted data. In general, the main technical difficulty is to query an encrypted database without ever having to decrypt it. A number of techniques for practical query processing over encrypted data have been proposed recently, including keyword search queries [41, 11, 13], range queries [40, 22, 30], k-nearest neighbor queries [45, 16, 47, 12], as well as other aggregate queries. Surprisingly, to the best of our knowledge, although top-k queries are important query types in many database applications [23], none of the existing works are applicable to

Email: xmeng@cs.bu.edu, zhu@cs.bu.edu, gkollios@cs.bu.edu
¹ HIPAA is the federal Health Insurance Portability and Accountability Act of 1996.
arXiv:1510.05175v2 [cs.CR] 3 Dec 2015