Scientometrics, Vol. 79, No. 2 (2009) 311–327
DOI: 10.1007/s11192-009-0420-4
Jointly published by Akadémiai Kiadó, Budapest and Springer, Dordrecht
Received December 5, 2007
Address for correspondence:
SUZY RAMANANA-RAHARY
E-mail: suzy.ramanana@obs-ost.fr
Copyright © 2008 Akadémiai Kiadó, Budapest
All rights reserved
Aggregation properties of relative impact and other classical indicators:
Convexity issues and the Yule–Simpson paradox
SUZY RAMANANA-RAHARY (a), MICHEL ZITT (a,b), RONALD ROUSSEAU (c,d)
(a) Observatoire des Sciences et des Techniques (OST), Paris, France
(b) INRA-Lereco, Nantes, France
(c) KHBO – Industrial Sciences and Technology, Zeedijk 101, B-8400 Oostende, Belgium
(d) K.U. Leuven, Steunpunt O&O Indicatoren and Dept. MSI, Dekenstraat 2, B-3000 Leuven, Belgium
Among classical bibliometric indicators, direct and relative impact measures for countries or
other players in science are appealing and standard. Yet, as shown in this article, they may exhibit
undesirable statistical properties, or at least ones that pose questions of interpretation in evaluation
and benchmarking contexts. We address two such properties: sensitivity to
the Yule–Simpson effect, and a problem related to convexity. The Yule–Simpson effect can occur
for direct impacts and, in a variant form, for relative impact, causing an apparent incoherence
between field values and the aggregate (all-fields) value. For relative impacts, it may result in a
severe form of ‘out-range’ of aggregate values, where a player’s relative impact shifts from ‘good’
to ‘bad’, or conversely. Out-range and lack of convexity in general are typical of relative impact
indicators. Using empirical data, we suggest that, for relative impact measures, ‘out-range’ due to
lack of convexity is not exceptional. The Yule–Simpson effect is less frequent and occurs
mainly for small players with particular specialisation profiles.
Introduction
Typical dimensions of scientometric indicators are the geographic, the thematic and
the time dimension. A simple indicator of publication or citation will be presented as an
n-tuple, for instance: measure, player, field, period, etc. Similarly, a ‘geographically
bidimensional’ indicator (such as co-authorship) will be represented by an n-tuple:
measure, player, player, field, period, etc.
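As an illustration of the n-tuple representation, the two kinds of indicator could be sketched as typed records. This is only a minimal sketch: the class names, the field names and the added value component are assumptions made for illustration, and the figures are made-up placeholders, not data from this study.

```python
from typing import NamedTuple


class Indicator(NamedTuple):
    """One-player indicator: measure, player, field, period (plus a value)."""
    measure: str
    player: str
    field: str
    period: str
    value: float  # assumption: the numeric value is stored with the tuple


class BilateralIndicator(NamedTuple):
    """'Geographically bidimensional' indicator, e.g. co-authorship."""
    measure: str
    player_a: str
    player_b: str
    field: str
    period: str
    value: float


# Placeholder values, purely illustrative.
pubs = Indicator("publications", "FR", "Physics", "2001-2003", 120.0)
copub = BilateralIndicator("co-authorship", "FR", "BE", "Physics", "2001-2003", 15.0)
```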
A global set, say all (i.e. “total world”) publications collected over a certain period
in a given database, can be partitioned according to the same criteria as used to
construct the above-mentioned n-tuples. Typical examples are geographic and thematic
breakdowns. With counting methods that do not produce overlaps, such as
fractional counting, the data can be displayed as a contingency table:
player (i.e. region or country) × field (or theme) in our example.
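The construction of such a player × field table can be sketched as follows. This is a minimal illustration under stated assumptions, not the counting scheme used in this study: the records are hypothetical toy data, and each publication's unit weight is assumed to be split evenly over all its (player, field) pairs, which is one simple way to avoid overlaps.

```python
from collections import defaultdict

# Hypothetical toy records: each publication lists the players (countries)
# and the fields it is assigned to. Names and data are illustrative only.
publications = [
    {"players": ["FR", "BE"], "fields": ["Physics"]},
    {"players": ["FR"], "fields": ["Physics", "Chemistry"]},
    {"players": ["BE"], "fields": ["Chemistry"]},
]


def contingency_table(pubs):
    """Return {player: {field: fractional count}} for a list of publications."""
    table = defaultdict(lambda: defaultdict(float))
    for pub in pubs:
        # Assumption: each publication has total weight 1, shared equally
        # among all of its (player, field) pairs.
        weight = 1.0 / (len(pub["players"]) * len(pub["fields"]))
        for player in pub["players"]:
            for field in pub["fields"]:
                table[player][field] += weight
    return {player: dict(fields) for player, fields in table.items()}


table = contingency_table(publications)
# With non-overlapping counts, the cells sum to the number of publications.
grand_total = sum(sum(row.values()) for row in table.values())
```

Because no publication is counted more than once in total, row sums (player totals), column sums (field totals) and the grand total are mutually consistent, which is what allows the table to be read as a contingency table.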