Sampling of Networks with Traceroute-Like Probes
Alain Barrat (a), Ignacio Alvarez-Hamelin (a), Luca Dall’Asta (a), Alexei Vázquez (b), Alessandro Vespignani (a, c)
(a) Laboratoire de Physique Théorique (UMR 8627 du CNRS), Université de Paris-Sud, Orsay, France;
(b) Nieuwland Science Hall, University of Notre Dame, Notre Dame, Ind., and
(c) School of Informatics and Department of Physics, Indiana University, Bloomington, Ind., USA
A. Barrat, LPT
Bâtiment 210
Université de Paris-Sud, FR–91405 Orsay Cedex (France)
Tel. +33 1 69 15 82 22, Fax +33 1 69 15 82 87,
E-Mail Alain.Barrat@th.u-psud.fr
© 2006 S. Karger AG, Basel
Complexus 2006;3:83–96
INFORMATION TECHNOLOGY MODELLING
Key Words
Network sampling · Traceroute · Internet exploration · Topology inference
Abstract
Much of the recent interest in complex networks has been triggered by the
observation of particular characteristics of real-world networks, such as
small-world properties or heavy-tailed degree distributions. Many datasets are,
however, the result of an incomplete sampling of the underlying real networks,
and it has been argued that sampling procedures might introduce uncontrolled
biases in the statistical properties of the sampled graph. In this paper, we
explore this issue in the case of the Internet, which is generally mapped from
a limited set of sources by using traceroute-like probes.
The origin of the biases introduced by such a sampling process is investigated
Published online: August 25, 2006
DOI: 10.1159/000094191
Accessible online at:
www.karger.com/cpu
Simplexus
Despite the pervasive influence of today’s Internet, for business and science,
government, and just about every other area of human activity, it is not at all
easy to get a good snapshot of what the network looks like; that is, of the
pattern by which its computers are ‘wired’ together by telephone lines,
satellite links and so on. The Internet has grown up organically, and without
any central authority making it conform to some rational design. Hence, its
structure can only be determined through careful measurement. The usual
technique is a kind of Internet X-ray, carried out by sending information
packets from a handful of Internet sources, targeted towards many destinations.
Along the way, such packets record the pathways defined by the links on which
they travel, and so return narrow ‘slices’ of the overall network. By putting
many such slices together, researchers have assembled crude maps of the
Internet, and discovered what seem to be a number of interesting topological
features.
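
The measurement procedure described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' method: it assumes that each probe follows a single shortest path from its source to its target, and it uses a small random graph as a stand-in for the real network. The function names (`random_graph`, `bfs_parents`, `traceroute_sample`) and all parameter values are hypothetical choices made for illustration.

```python
import random
from collections import deque


def random_graph(n, p, rng):
    """Random graph as an adjacency dict (illustrative stand-in for a real network)."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def bfs_parents(adj, source):
    """Breadth-first search from `source`; keeps one shortest-path parent per reached node."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):  # sorted so that the chosen paths are reproducible
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def traceroute_sample(adj, sources, targets):
    """Union of one shortest path from every source to every reachable target."""
    edges = set()
    for s in sources:
        parent = bfs_parents(adj, s)
        for t in targets:
            if t not in parent:
                continue  # target is unreachable from this source
            node = t
            while parent[node] is not None:  # walk back from the target to the source
                edges.add(frozenset((node, parent[node])))
                node = parent[node]
    return edges


# Assumed parameters: 200 nodes, 3 sources, 50 targets.
rng = random.Random(0)
graph = random_graph(200, 0.05, rng)
sources = rng.sample(range(200), 3)
targets = rng.sample(range(200), 50)
sampled_edges = traceroute_sample(graph, sources, targets)
total_edges = sum(len(nbrs) for nbrs in graph.values()) // 2
print(f"discovered {len(sampled_edges)} of {total_edges} edges")
```

Each source contributes a tree of paths, one ‘slice’ of the network, and the map is the union of these trees. Any link that lies on none of the chosen paths stays invisible, which is why a map assembled this way is necessarily incomplete.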
For example, the Internet appears to be a ‘small world’ network in that any
pair of node computers (or ‘routers’) can be linked by short paths having only
a handful of links. The Internet also appears to have a highly ‘heterogeneous’
character, in that a few ‘hub’ nodes have hundreds or thousands of links, while
most have only one or a few. The latter feature appears in the particular
mathematical form of the distribution of nodes according to the number of links
they have (known as the ‘degree distribution’). For the Internet, as for many
other complex networks in physics or biology, this seems to follow a power law
form,
$P(k) \sim k^{-\gamma}$, where $\gamma$ is between 2.0 and 2.5.
The small world and heterogeneous character of the Internet appear to be key to
its capacity for stable and resilient operation even in the presence of
frequent computer failures, and understanding its topology in detail is crucial
for the intelligent design of