International Conference & Workshop on Recent Trends in Technology, (TCET) 2012
Proceedings published in International Journal of Computer Applications® (IJCA)

Data Integrity and Confidentiality in Outsourced Database

Sonal Balpande, Ramrao Adik Institute of Technology, Navi Mumbai
Rajashree Shegde, Ramrao Adik Institute of Tech., Navi Mumbai
Lata Ragha, Ramrao Adik Institute of Tech., Navi Mumbai

ABSTRACT
An increasing number of enterprises outsource their IT functions or business processes to third parties that offer these services at a lower cost owing to economies of scale. Quality of service has become a major concern in outsourcing. When database owners outsource their data to service providers, which might be untrusted or compromised, two data security issues emerge: data confidentiality and data integrity. Most previous research focuses on only one of these issues, and integrating the two approaches is expensive. Furthermore, no solution has been proposed for character queries. We propose an approach that preserves both data confidentiality and data integrity. The proposed approach is based on a bin checksum, which can be used to check the correctness, completeness and freshness of multiple tuples at one time.

General Terms
Outsourced Database, Database Security

Keywords
Data Integrity, Data Confidentiality, Outsourced Database

1. INTRODUCTION
In database outsourcing, usually referred to as Database as a Service [1], an external service provider provides mechanisms for clients to access the outsourced databases. A major advantage of database outsourcing is the lower cost of outsourced hosting compared with in-house hosting. Outsourcing provides significant cost savings and promises higher availability and more effective disaster protection than in-house operations.
On the other hand, database outsourcing poses a major security problem: the external service provider, which is relied upon to ensure high availability of the outsourced database (i.e., it is trusted in that respect), cannot always be trusted with the confidentiality of the database content.

There are three main entities in the Outsourced Database (ODB) model. A user poses queries to the client. A server, hosted by the service provider, stores the encrypted database. A client maintains metadata for translating user queries to the appropriate representation on the server and performs post-processing on the server's query results. It is assumed that the data owner may update the database periodically or occasionally, and that data management and retrieval happen only at the servers.

Existing proposals in the data outsourcing area typically support either data confidentiality or data integrity, and they support only limited types of queries. Furthermore, no scheme supports authentication of character data queries. To solve these problems, we propose a scheme that preserves data confidentiality while guaranteeing data integrity for numeric and character queries.

The remainder of the paper is organized as follows. Section 2 gives an overview of the existing proposals. Section 3 presents the system model for enforcing data confidentiality and integrity in the outsourced database. Authentication of range, aggregation and character-based queries is explained in Section 4. Finally, Section 5 concludes the paper.

2. LITERATURE SURVEY
Query authentication should verify data correctness, completeness and freshness. Three signature-based approaches have been proposed: tuple-level signature, aggregated signature (AS) [2] and signature chaining [3]. In tuple-level signature, each tuple is assigned a signature and is verified against this signature.
In AS, when $t$ signatures $s_1, \ldots, s_t$ on $t$ messages $m_1, \ldots, m_t$ signed by the same signer need to be verified all at once, certain signature schemes allow for more efficient communication and verification than $t$ individual signatures. In signature chaining, each signature covers three consecutive tuples, i.e. $s_i = \mathrm{sign}(t_{i-1} \,|\, t_i \,|\, t_{i+1})$. Note that $t_1, \ldots, t_N$ are sorted on some query predicate. Two special records $t_0$ and $t_{N+1}$ are added for the signatures of tuples $t_1$ and $t_N$.

In authenticated data structures, tuples are organized into a tree such that a single signature on the root node guarantees the data integrity of the other nodes in the tree. The Merkle hash tree (MHT) is a main-memory binary index tree in which each leaf node contains the hash of a tuple and each internal node contains the hash of the concatenation of its two children. Based on MHT, several disk-based dynamic variants have been proposed [4], [5]. Because one MHT is built on one attribute, multiple MHTs must be constructed to support authentication of multi-dimensional range queries.

Table 1: Overview of existing models

                                          Queries
Model   Confidentiality   Integrity   Range   Aggregation   Character
AS      _                 √           √       _             _
MHT     _                 √           √       _             _
SAE     _                 √           √       _             _
BBA     √                 √           √       √             _
CMCIS   √                 _           _       _             √
DIC     √                 √           √       √             √

SAE [6] separates authentication from query execution by exploiting trustworthy organizations with expertise in security issues. Such an organization is referred to as a trusted entity (TE). Although a TE possesses up-to-date resources and know-how on security standards, cryptographic libraries, etc., it does not necessarily have the infrastructure to manage large databases and high query loads. Therefore, SAE assigns to the TE only the authentication process, which involves little computational effort compared with the actual query processing performed at the service provider (SP). Clients issue queries directly to the SP, which sends back only the results.
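
The Merkle hash tree construction described earlier in this section can be illustrated with a minimal sketch. It assumes SHA-256 as the hash function, byte-string tuples, and duplication of the last node when a level has an odd number of nodes; none of these choices is specified in the surveyed work, and the function names are hypothetical. Signing the root, which is what lets the client trust it, is left out.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash function used for leaves and internal nodes (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def _next_level(level: list[bytes]) -> list[bytes]:
    """Each parent holds the hash of the concatenation of its two children."""
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(tuples: list[bytes]) -> bytes:
    """Root hash over the tuples; the data owner would sign this value."""
    if not tuples:
        raise ValueError("at least one tuple is required")
    level = [h(t) for t in tuples]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: pad odd levels by duplication
        level = _next_level(level)
    return level[0]


def merkle_proof(tuples: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root; the flag says whether the sibling is on the left."""
    level = [h(t) for t in tuples]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify(tuple_bytes: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the path to the root from one tuple and its sibling hashes."""
    node = h(tuple_bytes)
    for sibling_hash, sibling_is_left in proof:
        node = h(sibling_hash + node) if sibling_is_left else h(node + sibling_hash)
    return node == root
```

In this sketch the server would return a tuple together with its sibling hashes, and the client would recompute the root and compare it with the signed value. A modified tuple changes the recomputed root, so the check fails.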
Bucket-based authentication (BBA) [7] is based on a bucket checksum, which can be used for the authentication of multiple tuples. CMCIS [8] (Character Mapping Cipher Index Scheme) allows various keyword and fuzzy queries but does not provide data integrity. Table 1 gives an overview of the existing models.
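
Returning to the signature chaining approach [3] surveyed in this section, the following minimal sketch shows how chaining each signature over three consecutive tuples lets a client detect a tuple omitted from a range result. It is an illustration under stated assumptions, not the scheme of [3]: an HMAC from the Python standard library stands in for a public-key signature (so the verifier here shares the key), and the key, sentinel values and function names are hypothetical.

```python
import hashlib
import hmac

KEY = b"demo-owner-key"            # hypothetical key; a real scheme uses a private signing key
HEAD, TAIL = b"<t0>", b"<tN+1>"    # hypothetical stand-ins for the special records t_0 and t_{N+1}


def sign(prev: bytes, cur: bytes, nxt: bytes) -> bytes:
    """s_i = sign(t_{i-1} | t_i | t_{i+1}); fields are length-prefixed to avoid ambiguity."""
    msg = b"".join(len(p).to_bytes(4, "big") + p for p in (prev, cur, nxt))
    return hmac.new(KEY, msg, hashlib.sha256).digest()


def sign_chain(sorted_tuples: list[bytes]) -> list[bytes]:
    """One signature per tuple, each covering the tuple and its two neighbours."""
    padded = [HEAD] + sorted_tuples + [TAIL]
    return [sign(padded[i - 1], padded[i], padded[i + 1])
            for i in range(1, len(padded) - 1)]


def verify_range(result: list[bytes], sigs: list[bytes],
                 left: bytes, right: bytes) -> bool:
    """Check a range result given the boundary tuples just outside the range."""
    if not result or len(result) != len(sigs):
        return False  # empty results would need a separate proof, not covered here
    padded = [left] + result + [right]
    return all(
        hmac.compare_digest(sign(padded[i - 1], padded[i], padded[i + 1]), sigs[i - 1])
        for i in range(1, len(padded) - 1)
    )
```

If the server drops a tuple from the middle of the result, the neighbours of the remaining tuples no longer match what was signed, so verification fails. This is the completeness property that a tuple-level signature alone does not give.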