770 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 5, NO. 6, DECEMBER 1997

A Quantitative Comparison of Graph-Based Models for Internet Topology

Ellen W. Zegura, Member, IEEE, Kenneth L. Calvert, Member, IEEE, and Michael J. Donahoo

Abstract—Graphs are commonly used to model the topological structure of internetworks in order to study problems ranging from routing to resource reservation. A variety of graphs are found in the literature, including fixed topologies such as rings or stars, “well-known” topologies such as the ARPAnet, and randomly generated topologies. While many researchers rely upon graphs for analytic and simulation studies, there has been little analysis of the implications of using a particular model or how the graph generation method may affect the results of such studies. Further, the selection of one generation method over another is often arbitrary, since the differences and similarities between methods are not well understood. This paper considers the problem of generating and selecting graphs that reflect the properties of real internetworks. We review generation methods in common use and also propose several new methods. We consider a set of metrics that characterize the graphs produced by a method, and we quantify similarities and differences among several generation methods with respect to these metrics. We also consider the effect of the graph model in the context of a specific problem, namely multicast routing.

Index Terms—Internetworking, multicast, network modeling, scalability.

I. INTRODUCTION

A. Background

The explosive growth of the Internet has been accompanied by a wide range of internetworking problems related to routing, resource reservation, and administration. The study of algorithms and policies to address such problems often involves simulation or analysis using an abstraction or model of the actual network structure and applications.
The reason is clear: networks that are large enough to be interesting are also expensive and difficult to control, and therefore they are rarely available for experimental purposes. Moreover, it is generally more efficient to assess solutions using analysis or simulation, provided the model is a “good” abstraction of the real network and application. It is therefore remarkable that studies based on randomly generated or trivial network topologies are so common, while rigorous analyses of how the results scale or how they might change with a different topology are so rare.

The state of the art in network modeling includes:
1) regular topologies, such as rings, trees, and stars (e.g., [6], [13], [23]);
2) “well-known” topologies, such as the ARPAnet or NSFnet backbone (e.g., [1], [18], [24]);
3) randomly generated topologies (e.g., [19]–[21]).

The limitations of each of these are obvious. Well-known and regular topologies reflect only parts of current or past real networks; random topologies may not reflect any (past, present, or future) real network. Also clear from the cited references is the diverse set of problems that relies on network models to evaluate performance. Furthermore, most researchers seem to be aware of the perils of reaching conclusions about real networks based on these models; it is typical for papers to include a disclaimer to this effect.

Manuscript received August 26, 1996; revised July 1, 1997; approved by IEEE/ACM TRANSACTIONS ON NETWORKING Editor D. Estrin. The authors are with the College of Computing, Georgia Institute of Technology, Atlanta, GA 30332 USA (e-mail: ewz@cc.gatech.edu). Publisher Item Identifier S 1063-6692(97)08486-0.

To illustrate the important role that the network model can play in assessing algorithms, consider the following results.
1) Doar and Leslie found that the efficiency of their dynamic multicasting algorithms was reduced by as much as half when using random graphs versus using hierarchically structured graphs designed to reflect some of the properties of real internetworks. (See Fig. 5 in [11].)
2) Wei and Estrin found that the traffic concentration in core-based multicast routing trees is comparable to traffic concentration in shortest path trees for a network model with average node degree of about 3.0, but the traffic concentration is almost 30% higher in core-based trees when the average node degree is increased to 8.0. (See Fig. 9(a) in [22].)
3) Mitzel and Shenker found that multicast resource reservation styles compared quite differently in the quantity of resources reserved for linear, tree, and star topologies. (See Tables IV and V in [13].)

It should be clear from these examples that the network model does matter. The conclusions reached about the suitability and performance of algorithms may vary depending on the methods used to model the network.

A variety of criteria may be applied to assessing a network model, depending in large part on the intended use. For example, if the purpose is to stress test the algorithm, the model should generate instances which are, in some sense, “difficult.” For the problem of routing, this may mean topologies with moderate node degree and many routing choices. If the purpose is to model a particular (static) network (e.g., a campus or corporate network), then the model should accurately reflect the current topology.

1063–6692/97$10.00 © 1997 IEEE

Historically, large networks such as the Public Switched Telephone Network have grown according to a topological design developed by some central authority or administra-