Molecular Ecology (2006) 15, 2857–2869 doi: 10.1111/j.1365-294X.2006.02992.x
© 2006 The Authors
Journal compilation © 2006 Blackwell Publishing Ltd
Measurement of biological information with applications
from genes to landscapes
WILLIAM B. SHERWIN,*† FRANCK JABOT,*‡ REBECCA RUSH* and MAURIZIO ROSSETTO§
*School of Biological Earth and Environmental Science, University of New South Wales, Sydney, NSW 2052, Australia,
†Institut Des Sciences de l’Evolution, Université Montpellier 2, cc 065, Place Eugène Bataillon, 34095 Montpellier, Cedex 05 France,
‡Ecole Polytechnique 91128 Palaiseau, Cedex Paris, France, §National Herbarium of New South Wales, Botanic Gardens Trust, Mrs
Macquarie’s Road, Sydney, NSW 2000, Australia
Abstract
Biological diversity is quantified for reasons ranging from primer design, to bioprospecting,
and community ecology. As a common index for all levels, we suggest Shannon’s $^{S}H$,
already used in information theory and biodiversity of ecological communities. Since
Lewontin’s first use of this index to describe human genetic variation, it has been used for
variation of viruses, splice-junctions, and informativeness of pedigrees. However, until
now there has been no theory to predict expected values of this index under given genetic
and demographic conditions. We present a new null theory for $^{S}H$ at the genetic level, and
show that this index has advantages including (i) independence of measures at each hierarchical
level of organization; (ii) robust estimation of genetic exchange over a wide range
of conditions; (iii) ability to incorporate information on population size; and (iv) explicit
relationship to standard statistical tests. Utilization of this index in conjunction with other
existing indices offers powerful insights into genetic processes. Our genetic theory is also
extendible to the ecological community level, and thus can aid the comparison and integration
of diversity at the genetic and community levels, including the need for measures of
community diversity that incorporate the genetic differentiation between species.
Keywords: biodiversity, dispersal, population genetics, Shannon information, subdivision
Received 8 February 2006; revision accepted 18 April 2006
Introduction
There is enthusiasm for merging biodiversity databases
over a range of levels: ecosystems, species, genes (Sugden &
Pennisi 2000). There is also interest in explicitly comparing
genetic and species diversity between areas (Vellend 2005).
However, there has been little attention to providing common
measures for biodiversity at different levels, thus risking
comparisons of ‘apples with oranges’. Crist et al. (2003)
said that ‘Despite a growing empirical interest in diversity
partitioning, however, its use is still descriptive with little
theoretical basis for interpreting the observed patterns of α
and β diversity … or statistical methods for testing null
hypotheses on observed diversity partitions.’ There have
been a number of formulations of hierarchical biodiversity
(Whittaker 1972), with various uses of the terms α, β and γ.
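
For illustration, one widely used hierarchical formulation partitions diversity additively, so that γ = α + β. The following is a minimal Python sketch, not the authors' method. It assumes Shannon's index in the form H = −Σ p ln p, takes α as the weighted mean of within-locality diversity, and takes γ as the diversity of the pooled frequencies. The function names, the equal default weights and the example frequencies are all hypothetical.

```python
import math


def shannon(freqs):
    """Shannon's index H = -sum(p * ln p); zero frequencies contribute nothing."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


def additive_partition(localities, weights=None):
    """Return (alpha, beta, gamma) such that gamma = alpha + beta.

    localities: list of frequency vectors over the same categories
                (alleles or species), each summing to 1.
    weights:    relative weight of each locality, summing to 1;
                equal weights are assumed if omitted (an assumption of this sketch).
    """
    n = len(localities)
    if weights is None:
        weights = [1.0 / n] * n

    # alpha: weighted mean of diversity within each locality
    alpha = sum(w * shannon(f) for w, f in zip(weights, localities))

    # gamma: diversity of the pooled (weighted-average) frequencies
    k = len(localities[0])
    pooled = [sum(w * f[i] for w, f in zip(weights, localities)) for i in range(k)]
    gamma = shannon(pooled)

    # beta: the between-locality component, obtained by subtraction
    beta = gamma - alpha
    return alpha, beta, gamma


# Hypothetical allele frequencies at one locus in two localities
alpha, beta, gamma = additive_partition([[0.9, 0.1], [0.2, 0.8]])
print(f"alpha={alpha:.4f}  beta={beta:.4f}  gamma={gamma:.4f}")
```

Because the Shannon function is concave in the frequencies, the pooled diversity γ cannot be smaller than the weighted mean α, so β computed this way is never negative, and it is zero when all localities share the same frequencies.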
It is now accepted that criteria for a good diversity measure
include (i) additivity, so that diversity at the highest level, γ, equals
diversity at the local level, α, plus diversity between
localities, β; (ii) concavity, so that total diversity up to and
including a particular level of organization is always
greater than or equal to diversity at the lower level, and thus α,
β and γ are never negative; (iv) low bias, the systematic
deviation between the estimate and the true value; (v) low
imprecision, often expressed as the coefficient of variation
of the estimate (CV); and (vi) low root mean square error
(RMSE), a summary of the combined effects of bias and
imprecision (Allen 1975; Routledge 1979; Lande 1996;
Watve & Gangal 1996; Hubalek 2000; Lande et al. 2000;
Crist et al. 2003). There has been less attention to the
question ‘Does the index measure what we want to measure?’
In our opinion, this stems from the lack of a biological basis
for most of the measures: except in limited and extreme
cases, we do not know what values we expect to see in a
population or community with a particular history. We
first discuss the statistical properties of some common
indices, then summarize available theoretical expectations
Correspondence: William B. Sherwin, Fax: 61 (0)2-9385-1558;
E-mail: w.sherwin@unsw.edu.au