International Journal of Advances in Applied Sciences (IJAAS)
Vol. 7, No. 1, March 2018, pp. 21~28
ISSN: 2252-8814, DOI: 10.11591/ijaas.v7.i1.pp21-28
Journal homepage: http://iaescore.com/online/index.php/IJAAS

Data Partitioning in Mongo DB with Cloud

Aakanksha Jumle, Swati Ahirrao
Computer Science, Symbiosis Institute of Technology, Lavale, Pune, India

Article history: Received May 23, 2017; Revised Dec 27, 2017; Accepted Feb 18, 2018

ABSTRACT
Cloud computing offers a variety of useful services, such as IaaS, PaaS, and SaaS, for deploying applications at low cost and making them available anytime and anywhere, with the expectation that they be scalable and consistent. One technique for improving scalability is data partitioning. Existing partitioning techniques are not capable of tracking the data access pattern. This paper implements a scalable workload-driven technique for improving the scalability of web applications. The experiments are carried out in the cloud using the NoSQL data store MongoDB to scale out. This approach offers low response time, high throughput, and fewer distributed transactions. The partitioning technique is evaluated using the TPC-C benchmark.

Keywords: Data partitioning; Distributed transaction; Performance; Scalable Workload-Driven; TPC-C benchmark

Copyright © 2018 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author: Aakanksha Jumle, Computer Science, Symbiosis Institute of Technology, Lavale, Pune, India. Email: aakanksha.jumle@sitpune.edu.in

1. INTRODUCTION
Today, data volumes are growing rapidly because of the storage, transfer, and sharing of structured and unstructured data that inundates businesses. E-commerce sites and applications produce huge and complex data sets, termed Big Data. Big Data is a mature term that refers to large amounts of unstructured, semi-structured, and structured data.
Cloud computing provides a stable platform for organising and operating on data economically and efficiently. Handling and storing such huge data requires a large database, and a traditional database management system (DBMS) cannot cope with data management at this scale. Relational databases were not designed for the scale and speed challenges that modern applications face, nor were they built to take advantage of the commodity storage and computing power available today.

NoSQL stands for "Not only SQL", as these data stores partially support SQL. They are increasingly used in Big Data and in many web applications. NoSQL is mainly useful for storing unstructured data, which is growing faster than structured data and does not fit the relational schemas of an RDBMS. Hence NoSQL [1] data stores were introduced, offering high availability, high scalability, and consistency. NoSQL databases are widely used for processing heavy data and for web applications. Nowadays, most companies are shifting to NoSQL databases [1-3] for their flexibility and their ability to scale out and handle bulky unstructured data, in contrast with relational databases. NoSQL cloud data stores come in several types, including document stores, key-value stores, column-family stores, and graph databases. Their advantage is the ability to cope with very large data loads by scaling out applications.

The partitioning techniques in use are classified into static [4-5] and dynamic [6] partitioning systems. In static partitioning, related data items are placed on a single partition for data access, and once the partitions are formed they do not change. The advantage of static partitioning is that no data migration is performed, so the cost of data migration is negligible.
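
The idea of static partitioning described above can be illustrated with a minimal sketch. It assumes TPC-C style data in which items related to the same warehouse are kept together; the function names, the round-robin assignment, and the partition count are hypothetical choices made for illustration and are not taken from this paper.

```python
from typing import Dict, Iterable, Set

NUM_PARTITIONS = 3  # hypothetical number of partitions (assumption)


def build_static_map(warehouse_ids: Iterable[int], num_partitions: int) -> Dict[int, int]:
    """Assign every warehouse to a partition exactly once.

    Round-robin over the sorted ids is an illustrative assumption; the key
    property of a static scheme is that this map is never revised later,
    so no data migration takes place.
    """
    return {w: i % num_partitions for i, w in enumerate(sorted(warehouse_ids))}


def partitions_touched(txn_warehouses: Iterable[int], static_map: Dict[int, int]) -> Set[int]:
    """Return the set of partitions a transaction has to access."""
    return {static_map[w] for w in txn_warehouses}


def is_distributed(txn_warehouses: Iterable[int], static_map: Dict[int, int]) -> bool:
    """A transaction is distributed when it spans more than one partition."""
    return len(partitions_touched(txn_warehouses, static_map)) > 1


# Build the map once for six hypothetical warehouses.
static_map = build_static_map(range(1, 7), NUM_PARTITIONS)

if __name__ == "__main__":
    print(static_map)
    print(is_distributed([1, 4], static_map))  # both warehouses share a partition
    print(is_distributed([1, 2], static_map))  # warehouses sit on different partitions
```

Because the map is fixed, a transaction whose related items fall in one partition stays local, while a workload whose access pattern shifts over time may cross partitions more often; this is the trade-off against dynamic schemes that the static/dynamic classification points to.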