DOI: http://dx.doi.org/10.26483/ijarcs.v9i1.5089
Volume 9, No. 1, January-February 2018
International Journal of Advanced Research in Computer Science
RESEARCH PAPER
Available Online at www.ijarcs.info
© 2015-19, IJARCS All Rights Reserved 194
ISSN No. 0976-5697

INTEGRATING BIG DATA IN CLOUD ENVIRONMENT–A REVIEW

Mr. Deepak Ahlawat
PhD Research Scholar
MMU Sadopur, Ambala
Haryana, India

Dr. Deepali Gupta
HOD CSE
MMU Sadopur, Ambala
Haryana, India

Abstract: In this paper, the concepts of big data and cloud computing are integrated and reviewed. The term big data refers to the huge volume of data in today's internet environment, much of which cannot be integrated easily. Cloud computing and big data go hand in hand. Big data gives users the ability to use massive computing power to process distributed queries across different datasets and return result sets in a timely manner. Cloud computing is the paradigm in which resources are distributed, and with Hadoop these resources can be used efficiently. Furthermore, future work on the integration of big data and the cloud computing paradigm is presented.

Keywords: GA, PRF, CURE.

1. INTRODUCTION

1.1. Big Data

Big data [1] can be characterized by the 4Vs: the extreme volume of data, the wide variety of data types, the velocity at which the data must be processed, and the value gained by discovering hidden insights in large, varied, and rapidly generated datasets. The term big data refers to the huge volume of data in today's internet environment, much of which cannot be integrated easily. Performing useful analysis on big data takes a large amount of time and money. Because knowledge can only be derived from careful analysis of data (data mining), several new approaches to storing and analysing data have emerged.
In these approaches, raw data with extended metadata is aggregated in a data lake, and machine learning and artificial intelligence (AI) programs use complex algorithms to look for repeatable patterns [2]. Large amounts of data are collected because of human involvement in the digital space. Work is shared, stored, and managed, and it lives online. For example, several terabytes of data are uploaded and viewed on Facebook every day. Huge data of this kind, carrying useful information, is known as big data.

Fig. 1. Big Data Classification

Clustering is a capable data mining method that is widely used for extracting valuable information from unlabeled data. Over the last few decades, a number of clustering algorithms have been developed on the basis of a variety of theories and applications.

1.2. Cloud Computing

A cloud is a computing model in which services are delivered over a network by computing resources [3]. Its service models fall into three main categories [4]: