Journal of Intelligent & Fuzzy Systems xx (20xx) x–xx
DOI:10.3233/JIFS-179016
IOS Press

A quantitative and text-based characterization of big data research

Vedika Gupta a,*, Vivek Kumar Singh b, Udayan Ghose c and Pankaj Mukhija d
a Department of Computer Science and Engineering, National Institute of Technology Delhi, Delhi, India
b Department of Computer Science, Banaras Hindu University, Varanasi, India
c University School of Information and Communication Technology, Guru Gobind Singh Indraprastha University, Dwarka, Delhi, India
d Department of Electrical and Electronics Engineering, National Institute of Technology Delhi, Delhi, India

Abstract. This paper maps the research work carried out in the field of Big Data through a detailed analysis of scholarly articles published on the theme during 2010-16, as indexed in Scopus. We collected all relevant publications on Big Data indexed in Scopus and analyzed them through both a quantitative and a textual characterization. The analysis delves into parameters such as research productivity, growth of research and citations, thematic trends, top publication sources and emerging topics in this field. The study also investigates country-wise publication output and impact in terms of average citations per paper, country-level collaboration patterns, authorship, and leading contributors (countries and institutions). The publication data is further subjected to a detailed textual analysis to identify key themes in Big Data research, disciplinary variations, and thematic trends and patterns. Quantitative measures show that the number of publications related to Big Data has increased sharply during the last few years. Research work in Big Data, though primarily considered a sub-discipline of Computer Science, is now carried out by researchers in many disciplines.
Thematic analysis of publications in Big Data shows that it is a discipline attracting research interest from fields as diverse as Medicine and Social Sciences. The paper also identifies major keywords now associated with Big Data research, such as Cloud Computing, Deep Learning, Social Media and Data Analytics. Together, these results support a thorough understanding and visualization of the Big Data research area.

Keywords: Big data, big data analytics, data science, scientometrics

1. Introduction

Big Data refers to datasets that are so large as to pose challenges in storage and analysis via traditional data handling techniques. Big Data Analytics (BDA) is the umbrella term given to the practice of collecting, organizing and analyzing large sets of data (Big Data). Big Data analytics allows organizations to better comprehend the information contained within the data, and also helps in identifying the data that provides insightful knowledge for current as well as future business decisions. Big Data has percolated into nearly all domains of technology and is gaining considerable attention from academia as well as from industry and governments. Big Data analytics as a research area has become very important in recent years. It now encompasses all the techniques used to analyze data at large scale, spanning health care, policy making, astronomy, city planning, education, telecommunications, banking, IT and risk management, advertising, marketing and other strategic business domains.

* Corresponding author. Vedika Gupta, Department of Computer Science and Engineering, National Institute of Technology Delhi, Delhi-110040, India. Tel.: +91 9910172545; E-mail: vedika.nit@gmail.com.

1064-1246/18/$35.00 © 2018 – IOS Press and the authors. All rights reserved