Physica D 238 (2009) 1161–1167
Modularity density of network community divisions
Erik Holmström a,b,∗, Nicolas Bock b, Johan Brännlund c
a Instituto de Física, Universidad Austral de Chile, Valdivia, Chile
b Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
c Department of Mathematics & Statistics, Dalhousie University, Halifax, NS B3H 3J5, Canada
Article info
Article history: Received 3 April 2007; received in revised form 7 March 2008; accepted 23 March 2009; available online 5 April 2009. Communicated by A. Doelman.
Keywords: Modularity; Modularity density; Network clusters; Network communities
Abstract
Dividing a network into communities is an extremely complex problem whose difficulty grows very rapidly with
the number of nodes and edges involved. To develop good algorithms for identifying optimal
community divisions, it is extremely beneficial to identify properties that are similar for most networks.
We introduce the concept of modularity density, the distribution of modularity values as a function of the
number of communities, and find strong indications that the general features of this modularity density
are quite similar for different networks. The region of high modularity generally has very low probability
density and occurs where the number of communities is small. The properties and shape of the modularity
density may give valuable information and aid in the search for efficient algorithms to find community
divisions with high modularities.
© 2009 Elsevier B.V. All rights reserved.
1. Introduction
The nodes of a network can be grouped into communities
which are loosely defined as groups of nodes that are more
‘‘related’’ to each other in some fashion than they are related to
the rest of the network. Such a community division can reveal
important structures of the network. In a recent study, for instance,
Wilkinson and Huberman [1] introduced a method to create a
network of gene co-occurrences from the literature and interpret
its communities as groups of genes related to each other by
their function. Since some of the genes in these communities
are not known to be related to the community’s function, this
method may help to identify unknown relationships of
this sort. Massen and Doye [2] used a community analysis on a
potential energy landscape to identify transition states of small
Lennard–Jones clusters. Networks have also been very successfully
used to simulate dynamics in various systems. By modeling
a community structure of individuals using a contact network
model, Meyers et al. [3] predicted the dynamics of a SARS outbreak.
It is very difficult to find a good partitioning of a network into
communities. In fact, maximizing the modularity is NP-hard [4].
Many different approaches have been used to identify community
structures in networks. To name a few more recent methods:
vertex similarity [5], vertex degree gradient [6], resistor
network [7], Potts Hamiltonian model [8], and an information-theoretic
approach [9]. For some comparative reviews of community
identification methods, see Refs. [10,11].

∗ Corresponding author at: Instituto de Física, Universidad Austral de Chile, Valdivia, Chile. Tel.: +56 63225938.
E-mail addresses: erikh@lanl.gov, eholmstrom@uach.cl (E. Holmström).

The most popular methods appear to be those based on
the network modularity Q introduced by Newman and co-workers
[12–16]. The advantage of the modularity Q is that
it is a well-defined number that quantifies the quality of a particular
community division of a network. It is larger for divisions that split
the network into groups with many intra-edges (edges within a group)
and few inter-edges (edges between groups).
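As a concrete illustration of how Q rewards intra-edges and penalizes inter-edges, the following Python sketch evaluates Q for a few divisions of a small graph. It assumes the standard Newman-Girvan form Q = Σ_c [l_c/m − (d_c/(2m))²] for an undirected, unweighted graph, where l_c is the number of edges inside community c, d_c is the summed degree of its nodes, and m is the total number of edges. This formula is not stated in the text above, and the function name `modularity`, its arguments, and the example graph are hypothetical choices made only for illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a community division (assumed Newman-Girvan form).

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to a community label.
    """
    m = 0                      # total number of edges
    intra = defaultdict(int)   # l_c: edges with both ends in community c
    degree = defaultdict(int)  # d_c: summed node degree in community c
    for u, v in edges:
        m += 1
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    if m == 0:
        raise ValueError("graph has no edges")
    # Q = sum over communities of (fraction of intra-edges) - (degree share)^2
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Hypothetical example: two triangles joined by a single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

two_triangles = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_community = {node: "all" for node in range(6)}
singletons = {node: node for node in range(6)}

q_triangles = modularity(edges, two_triangles)
q_one = modularity(edges, one_community)
q_singletons = modularity(edges, singletons)
```

Under these assumptions, the division into the two triangles gives Q = 5/14 ≈ 0.357, placing all nodes in a single community gives Q = 0, and isolating every node gives a negative value, in line with the statement that Q is larger for divisions with many intra-edges and few inter-edges.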
A number of different strategies have been proposed for
finding the optimal community division based on the modularity.
These methods can be broadly divided into two different classes.
Path-bound methods are agglomerative or divisive and either
successively add or take away edges in the network so as to reduce
the number of communities by merging existing communities
(agglomerative) or to increase the number of communities by
taking away edges and splitting existing communities (divisive). In
both cases, the number of possible community divisions depends
on the previous steps in the algorithm, or the particular path
that was taken in the space of all possible community divisions.
The resulting evolution of the community structure is commonly
called a dendrogram. The different methods in this class differ
in the way they identify the edges to be removed or added.
Examples are shortest-path betweenness [14], random-path
betweenness [14], and the greedy algorithm [16,17]. All these
methods have in common that they follow a dendrogram and
attempt to identify the edges to be removed or added by optimizing
the resulting modularity change. The number of communities is
doi:10.1016/j.physd.2009.03.015