Journal of Intelligent & Fuzzy Systems xx (20xx) x–xx
DOI:10.3233/JIFS-179016
IOS Press
A quantitative and text-based
characterization of big data research
Vedika Gupta a,*, Vivek Kumar Singh b, Udayan Ghose c and Pankaj Mukhija d
a Department of Computer Science and Engineering, National Institute of Technology Delhi, Delhi, India
b Department of Computer Science, Banaras Hindu University, Varanasi, India
c University School of Information and Communication Technology, Guru Gobind Singh Indraprastha University, Dwarka, Delhi, India
d Department of Electrical and Electronics Engineering, National Institute of Technology Delhi, Delhi, India
Abstract. This paper maps the research carried out in the field of Big Data through a detailed analysis of scholarly
articles published on the theme during 2010-16, as indexed in Scopus. We collected all relevant publications on Big Data
and characterized them both quantitatively and textually. The analysis delves into parameters such as research
productivity, growth of research and citations, thematic trends, top publication sources and emerging topics in this
field. The study also investigates country-wise publication output and impact in terms of average citations per paper,
country-level collaboration patterns, authorship and leading contributors (countries, institutions). The publication
data is also subjected to a detailed textual analysis to identify key themes in Big Data research, disciplinary
variations, and thematic trends and patterns. Quantitative measures show a sharp increase in the number of publications
related to Big Data over the last few years. Research in Big Data, though primarily considered a sub-discipline of
Computer Science, is now carried out by researchers in many disciplines. Thematic analysis of the publications shows
that Big Data draws research interest from fields as diverse as Medicine and the Social Sciences. The paper also
identifies major keywords now associated with Big Data research, such as Cloud Computing, Deep Learning, Social Media
and Data Analytics. Together, these results support a thorough understanding and visualization of the Big Data
research area.
Keywords: Big data, big data analytics, data science, scientometrics
1. Introduction
Big Data refers to datasets so large that they pose challenges for storage and analysis with traditional data handling techniques. Big Data Analytics (BDA) is the umbrella term for the practice of collecting, organizing and analyzing large sets of data (Big Data). Big Data analytics allows organizations to better comprehend the information contained within the data and helps identify the data that provides insightful knowledge for current as well as future business decisions. Big Data has percolated into nearly all domains of technology and is gaining considerable attention from academia as well as industry and governments. Big Data analytics as a research area has become very important in recent years. It now encompasses the techniques used to analyze data at large scale across health care, policy making, astronomy, city planning, education, telecommunications, banking, IT and risk management, advertising, marketing and other strategic business domains.

* Corresponding author. Vedika Gupta, Department of Computer Science and Engineering, National Institute of Technology Delhi, Delhi-110040, India. Tel.: +91 9910172545; E-mail: vedika.nit@gmail.com.
1064-1246/18/$35.00 © 2018 – IOS Press and the authors. All rights reserved