FOCAS: Penalising friendly citations to improve author ranking
Jorge Silva
CRACS/INESC TEC & University of Porto
Porto, Portugal
jorge.m.silva@inesctec.pt
David Aparício
CRACS/INESC TEC & University of Porto
Porto, Portugal
daparicio@dcc.fc.up.pt
Pedro Ribeiro
CRACS/INESC TEC & University of Porto
Porto, Portugal
pribeiro@dcc.fc.up.pt
Fernando Silva
CRACS/INESC TEC & University of Porto
Porto, Portugal
fmsilva@dcc.fc.up.pt
ABSTRACT
Scientific impact is commonly associated with the number of citations received. However, an author can easily boost his own citation count by (i) publishing articles that cite his own previous work (self-citations), (ii) having co-authors cite his work (co-author citations), or (iii) exchanging citations with authors from other research groups (reciprocated citations). Even though these friendly citations inflate an author's perceived scientific impact, author ranking algorithms do not normally address them; at most, they remove self-citations. Here we present Friends-Only Citations AnalySer (FOCAS), a method that identifies friendly citations and reduces their negative effect on author ranking algorithms. FOCAS combines the author citation network with the co-authorship network in order to measure author proximity and penalises citations between friendly authors. FOCAS is general and can be regarded as an independent module applied while running (any) PageRank-like author ranking algorithm. FOCAS can be tuned to use three different criteria, namely authors' distance, citation frequency, and citation recency, or combinations of these. We evaluate and compare FOCAS against eight state-of-the-art author ranking algorithms, comparing their rankings with a ground-truth of best paper awards. We test our hypothesis on a citation and co-authorship network comprising seven top Information Retrieval conferences. We observed that FOCAS improved author rankings by 25% on average and, in one case, led to a gain of 46%.
CCS CONCEPTS
· Human-centered computing → Social network analysis; ·
Computing methodologies → Ranking;
KEYWORDS
Author ranking, self-citations, friendly citations, citation networks,
co-authorship networks
SAC ’20, March 30-April 3, 2020, Brno, Czech Republic
© 2020 Association for Computing Machinery.
ACM ISBN 978-1-4503-6866-7/20/03. . . $15.00
https://doi.org/10.1145/3341105.3373991
ACM Reference Format:
Jorge Silva, David Aparício, Pedro Ribeiro, and Fernando Silva. 2020. FO-
CAS: Penalising friendly citations to improve author ranking. In The 35th
ACM/SIGAPP Symposium on Applied Computing (SAC ’20), March 30-April
3, 2020, Brno, Czech Republic. ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/3341105.3373991
1 INTRODUCTION
Deciding where (or to whom) to allocate research funding is a
problem that affects all scientists directly. This is typically done
by attempting to assess the impact of a scientist, that is, to determine
how much his research work has contributed to advancing his
scientific field. The impact of scientists is also commonly used to
pick scientific committees, award research grants, or decide faculty
promotions. These processes are not fully automated and are
traditionally done by peers. However, bibliometrics can help,
since they provide an unbiased estimator of scientific impact. For
example, the h-index [5] counts the number of publications that a
scientist (or author) has with at least h citations (e.g., an author
has h-index = 7 if he has 7 papers with at least 7 citations). Many
variations of the h-index have been proposed [8, 10], but the h-index
remains widely used.
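The h-index definition above can be made concrete with a short sketch (our own illustration, not part of the paper's contribution):

```python
def h_index(citations):
    """Largest h such that the author has at least h papers
    with at least h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(counts, start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

# Nine papers, seven of which have at least 7 citations each:
print(h_index([20, 15, 12, 10, 9, 8, 7, 3, 2]))  # -> 7
```

Sorting in descending order lets the h-index be read off as the last position whose citation count still reaches its rank.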
Another common approach to evaluating an author's impact is to
use graph metrics on citation networks. Computing graph metrics
is computationally more expensive than calculating bibliometrics,
but it has some advantages, namely (i) they give credit for indirect
citations (i.e., if A cites B, and B cites C, then C receives part of the
credit of A's citation of B), and (ii) they measure the author's
impact at a group scale, that is, the impact of each author depends
on the impact of the authors who cite him. PageRank [9] is the
most widely used graph algorithm to measure an author's impact,
and many variations have been proposed specifically for author
ranking [2, 3, 11, 14, 16, 19]. One of PageRank's major algorithmic
ideas is that nodes are not all equal, i.e., in its original context of
hyperlinks, it is good that any webpage points at yours, but it is better
that important webpages point at yours. This idea naturally extends
to author citation networks, meaning that it is good to be cited by
any author but it is better to be cited by important authors.
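To make the indirect-credit idea concrete, here is a minimal PageRank power-iteration sketch over a toy author citation graph (an illustration of the general algorithm only; the author ranking variants cited above add their own refinements on top of this core):

```python
def pagerank(nodes, edges, damping=0.85, iters=50):
    """Basic PageRank by power iteration on a directed graph.
    edges are (citing, cited) pairs: rank flows from citer to cited."""
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    out_deg = {v: 0 for v in nodes}
    for u, _ in edges:
        out_deg[u] += 1
    for _ in range(iters):
        new = {v: (1 - damping) / n for v in nodes}
        for u, v in edges:
            new[v] += damping * rank[u] / out_deg[u]
        # authors with no outgoing citations spread their rank uniformly
        dangling = sum(rank[v] for v in nodes if out_deg[v] == 0)
        for v in nodes:
            new[v] += damping * dangling / n
        rank = new
    return rank

# A cites B and B cites C, so C gets indirect credit for A's citation:
ranks = pagerank(["A", "B", "C"], [("A", "B"), ("B", "C")])
```

In this toy graph C ends up ranked highest: it is cited by B, which is itself cited, so part of A's citation of B propagates to C, exactly the indirect-credit behaviour described above.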
Regardless of the metric used to evaluate scientific impact (e.g.,
bibliometrics or graph metrics), citations are important, and several
works study how an author can increase his number of citations.
Undoubtedly, the quality of the author's work is correlated with
his number of citations [17]. However, other factors such as the
author's co-authorship network [12] and his social behaviour [4, 15]