Int. J. Computational Science and Engineering, Vol. 21, No. 3, 2020
Copyright © 2020 Inderscience Enterprises Ltd.
Analysing research collaboration through
co-authorship networks in a big data environment:
an efficient parallel approach
Carlos Roberto Valêncio*, José Carlos de Freitas,
Rogéria Cristiane Gratão de Souza,
Leandro Alves Neves and
Geraldo Francisco Donegá Zafalon
Institute of Biosciences, Humanities and Exact Sciences (IBILCE),
São Paulo State University (UNESP),
Campus São José do Rio Preto, São Paulo, Brazil
Email: carlos.valencio@unesp.br
Email: jose.freitas@unesp.br
Email: rogeria.souza@unesp.br
Email: leandro.neves@unesp.br
Email: geraldo.zafalon@unesp.br
*Corresponding author
Angelo Cesar Colombini
Fluminense Federal University (UFF),
Niterói, Rio de Janeiro, Brazil
Email: accolombini@id.uff.br
William Tenório
Institute of Biosciences, Humanities and Exact Sciences (IBILCE),
São Paulo State University (UNESP),
Campus São José do Rio Preto, São Paulo, Brazil
Email: william.tenorio@unesp.br
Abstract: Bibliometrics is the quantitative study of scientific production and enables the
characterisation of scientific collaboration networks. However, as science develops and
scientific production increases, large collaboration networks are formed, which makes
bibliometric measures difficult to extract. In this context, this work presents an efficient parallel
optimisation of three bibliometric measures for co-authorship network analysis using multithreaded
programming: transitivity, average distance, and diameter. Our experiments found that, as the
co-authorship network grows, the time taken to calculate the transitivity value with the sequential
approach grows 4.08 times faster than with the proposed parallel approach. Similarly, the
time taken to calculate the average distance and diameter values with the sequential approach
grows 5.27 times faster than with the proposed parallel approach as the co-authorship
network grows. In addition, we report relevant speedup and efficiency values for the
developed algorithms.
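As an illustration of the three measures named above, the following is a minimal sketch and not the authors' implementation. It assumes an undirected, connected co-authorship graph stored as a dictionary of neighbour sets. The function names, the toy graph, and the use of one breadth-first search per author inside a thread pool are all assumptions made for this example. The speedup and efficiency helper uses the standard textbook definitions (sequential time over parallel time, and speedup over number of workers).

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def transitivity(adj):
    """Fraction of connected triples that are closed (3 * triangles / triples)."""
    closed = 0   # closed triples; each triangle is counted once per centre vertex
    triples = 0  # all connected triples (paths of length two)
    for nbrs in adj.values():
        k = len(nbrs)
        triples += k * (k - 1) // 2
        for a, b in combinations(nbrs, 2):
            if b in adj[a]:
                closed += 1
    return closed / triples if triples else 0.0


def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def distance_metrics(adj, workers=4):
    """Average distance and diameter, with one BFS per vertex run in a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda s: bfs_distances(adj, s), adj))
    total = pairs = diameter = 0
    for dist in results:
        for d in dist.values():
            if d > 0:  # skip the source itself
                total += d
                pairs += 1
                diameter = max(diameter, d)
    return (total / pairs if pairs else 0.0), diameter


def speedup_and_efficiency(t_seq, t_par, workers):
    """Standard definitions: speedup = t_seq / t_par, efficiency = speedup / workers."""
    speedup = t_seq / t_par
    return speedup, speedup / workers


# Hypothetical toy network: authors A, B, C co-authored together; C also wrote with D.
toy = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
print(transitivity(toy))      # 3 closed triples out of 5
print(distance_metrics(toy))  # (average distance, diameter)
```

The per-vertex searches are independent, which is what makes the average distance and diameter computation easy to split across threads. In CPython the global interpreter lock limits the gain from threads for CPU-bound work, so the thread pool here only illustrates the decomposition rather than a measured performance improvement.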
Keywords: bibliometrics; graphs; knowledge extraction; co-authorship network; NoSQL;
parallel computing.
Reference to this paper should be made as follows: Valêncio, C.R., de Freitas, J.C.,
de Souza, R.C.G., Neves, L.A., Zafalon, G.F.D., Colombini, A.C. and Tenório, W. (2020)
‘Analysing research collaboration through co-authorship networks in a big data environment: an
efficient parallel approach’, Int. J. Computational Science and Engineering, Vol. 21, No. 3,
pp.364–374.
Biographical notes: Carlos Roberto Valêncio has been a Professor at the Institute of Biosciences,
Humanities and Exact Sciences (IBILCE) of the São Paulo State University (UNESP) since
1989. He received his PhD degree in Computational Physics (2000) from the University of São
Paulo (USP). His research interests include relational databases, NoSQL databases, data mining,
spatial data mining, knowledge discovery processes and geographic information systems.