DOI: http://dx.doi.org/10.26483/ijarcs.v9i1.5089
Volume 9, No. 1, January-February 2018
International Journal of Advanced Research in Computer Science
RESEARCH PAPER
Available Online at www.ijarcs.info
© 2015-19, IJARCS All Rights Reserved 194
ISSN No. 0976-5697
INTEGRATING BIG DATA IN CLOUD ENVIRONMENT–A REVIEW
Mr.Deepak Ahlawat
PhD Research Scholar MMU Sadopur
Ambala Haryana, India
Dr.Deepali Gupta
HOD CSE MMU Sadopur
Ambala Haryana, India
Abstract: In this paper, the concepts of big data and cloud computing are reviewed with a view to their integration. The term big data refers to the huge
volume of data in today’s internet environment, much of which cannot be integrated easily. Cloud computing and big data go hand in hand. Big data gives
users the ability to utilize massive computing power to process distributed queries across different datasets and return result sets in a
timely manner. Cloud computing is the paradigm in which various resources are distributed, and with the use of Hadoop these resources can be utilized
efficiently. Furthermore, future work on the integration of the big data and cloud computing paradigms is also presented.
Keywords: GA, PRF, CURE.
1. INTRODUCTION
1.1. Big Data
Big data [1] can be characterized by 4Vs: the extreme
volume of data, the wide variety of types of data, the
velocity at which the data must be processed, and the
value gained by discovering hidden insights in large
datasets of various types that are generated rapidly. The
term big data refers to the huge volume of data in today’s
internet environment, much of which cannot be integrated
easily. Analysing big data to obtain useful results takes a
huge amount of time and money. As knowledge can only be
derived from a careful analysis of data (data mining),
several new approaches to storing and analysing data have
emerged. In these approaches, raw data with extended
metadata is aggregated in a data lake, and machine learning
and artificial intelligence (AI) programs use complex
algorithms to look for repeatable patterns [2]. Large
amounts of data are collected because of human involvement
in the digital space. Work is shared, stored, and managed,
and it lives online. As an example, several terabytes of
data are uploaded and viewed on Facebook daily.
Fig. 1. Big Data Classification
Data of this kind, which is huge and contains useful
information, is known as big data. Clustering is a capable
data mining method that is widely used for mining valuable
information from unlabeled data. Over the last few decades,
a number of clustering algorithms have been developed on
the basis of a variety of theories and applications.
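To make the idea of clustering unlabeled data concrete, the following is a minimal sketch in Python. It uses k-means purely as an illustration; the text above does not name a specific algorithm, so the choice of k-means, the function name kmeans_sketch, the sample points, and the parameter values are all assumptions made for this example rather than a method taken from the reviewed work.

```python
import random


def kmeans_sketch(points, k, iterations=20, seed=0):
    """Illustrative k-means: group unlabeled points (tuples of floats) into k clusters."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen as the initial centroids.
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: each point joins the cluster of its nearest centroid.
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters


# Hypothetical unlabeled data: two visibly separated groups of 2-D points.
sample_points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (8.0, 8.0), (8.2, 7.9), (7.8, 8.1),
]
centroids, clusters = kmeans_sketch(sample_points, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

No labels are supplied at any point: the grouping emerges only from the distances between points, which is what makes clustering suitable for unlabeled data. At big data scale the same assignment and update steps would normally be distributed across many machines instead of running in a single loop.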
1.2. Cloud Computing
Cloud computing is a computing paradigm in which services
are distributed over a network [3].
Service models consist of three main categories [4]: