Symposium on Complex Systems Engineering, The RAND Corporation, Santa Monica, CA, January 11-12, 2007

Untangling the Information Web of Complex System Design

Dan Braha
University of Massachusetts Dartmouth, MA, USA
New England Complex Institute, Cambridge, MA, USA
braha@necsi.org, dbraha@umassd.edu
http://necsi.org/affiliates/affiliates.html

Yaneer Bar-Yam
New England Complex Institute, Cambridge, MA, USA
yaneer@necsi.org
http://necsi.org/faculty/faculty.html

Abstract - Understanding the structure and function of complex networks has recently become the foundation for explaining many different real-world complex biological, technological and informal social phenomena. The analysis of these networks has uncovered surprising statistical structural properties that have also been shown to have a major effect on their functionality, dynamics, robustness, and fragility. This paper examines the statistical properties of large-scale product development networks and discusses the significance of these properties in providing insight into ways of improving the strategic and operational decision-making of the organization. We believe that our new analysis methodology and empirical results are also relevant to other organizational information-carrying networks.

Keywords: Complex Product Design, Large-Scale Engineering Systems, Collective Decision Making, Social Networks, Complex Systems, Robustness, Network Dynamics, Network Effects, Tippy Dynamics

1 The Connectivity Syndrome

A large-scale product design and development (PD) process is a distributed problem-solving activity in which hundreds of designers carry out tasks and revise their actions based on other people's input [7-9]. If the amount of information generated by project participants is not properly controlled, participants will be expected to act on this information, creating even more information for other people to act on, and so on.
This chain reaction will result in unintentional delays that were not accounted for at the outset of the project. Which patterns of information connectivity lead to better project performance? What are the patterns of information connectivity observed in real-world large-scale PD organizations? Do these observed patterns share common principles? And what can the structure of connectivity teach us about PD dynamics?

The above questions are addressed by applying techniques from complex networks theory, which is reviewed in Section 2. In Section 3, we present an analysis of the PD task networks, their ‘small-world’ property, and node connectivity distributions. We demonstrate the distinct roles of incoming and outgoing information flows in distributed PD processes by analyzing the corresponding in-degree and out-degree link distributions. In Sections 4 and 5, we show that the statistical structural properties of PD projects have a major effect on their functionality, dynamics, robustness, and fragility. In Section 6 we present our conclusions.

2 Structural Properties of Complex Networks

Complex networks can be defined formally in terms of a graph $G = (V, E)$, which is a pair consisting of a set of nodes $V = \{1, 2, \ldots, N\}$ and a set of lines $E = \{e_1, e_2, \ldots, e_L\}$ between pairs of nodes. If the line between two nodes is non-directional, then the network is called undirected; otherwise, the network is called directed. A network is usually represented by a diagram, where nodes are drawn as points, undirected lines are drawn as edges, and directed lines are drawn as arcs connecting the corresponding two nodes.

Three properties have been used to characterize ‘real-world’ complex networks [9, 10]. The first characteristic is the average distance (geodesic) between two nodes, where the distance $d(i, j)$ between nodes $i$ and $j$ is defined as the number of edges along the shortest path connecting them.
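The definitions above can be illustrated with a minimal sketch, which is not taken from the paper. It assumes a network is stored as adjacency lists and computes the distance $d(i, j)$ by breadth-first search. The function names `build_adjacency` and `distance`, and the five-node example network, are hypothetical and chosen only for illustration.

```python
from collections import deque


def build_adjacency(num_nodes, lines, directed=False):
    """Return adjacency lists for nodes 1..num_nodes from a list of (i, j) lines."""
    adjacency = {node: [] for node in range(1, num_nodes + 1)}
    for i, j in lines:
        adjacency[i].append(j)
        if not directed:
            # An undirected edge can be traversed in both directions.
            adjacency[j].append(i)
    return adjacency


def distance(adjacency, source, target):
    """Number of edges on a shortest path from source to target.

    Returns None when target cannot be reached from source.
    """
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return hops[node]
        for neighbor in adjacency[node]:
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return None


# Hypothetical example network with N = 5 nodes and L = 4 lines.
lines = [(1, 2), (2, 3), (3, 4), (1, 5)]
undirected = build_adjacency(5, lines)
directed = build_adjacency(5, lines, directed=True)
```

In this example, `distance(undirected, 5, 4)` follows the path 5, 1, 2, 3, 4, whereas in the directed version node 5 has no outgoing arcs, so node 4 is unreachable from it. This illustrates why the direction of lines matters when measuring distances.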
The characteristic path length $\ell$ is the average distance between any two vertices:

$$\ell = \frac{1}{N(N-1)} \sum_{i \neq j} d_{ij} \qquad (1)$$

The second characteristic measures the tendency of vertices to be locally interconnected or to cluster in dense modules. The clustering coefficient $C_i$ of a vertex $i$ is defined as follows. Let vertex $i$ be connected to $k_i$ neighbors. The total number of edges between these neighbors is at most $k_i (k_i - 1)/2$. If the actual number of edges between these $k_i$ neighbors is $n_i$, then the clustering coefficient $C_i$ of the vertex $i$ is the ratio