Conceptual Framework for Agent-based Modeling and Simulation

Yongqin Gao
Department of Computer Science and Engineering
University of Notre Dame
ygao1@nd.edu

Vince Freeh
Department of Computer Science
North Carolina State University
vin@cse.ncsu.edu

Greg Madey
Department of Computer Science and Engineering
University of Notre Dame
gmadey@nd.edu

Abstract

The Open Source Software (OSS) [6] development movement is a classic example of a collaborative social network [4, 5, 6]; it is also a prototype of a complex evolving network [2, 19, 20]. By collecting developer and project information monthly from SourceForge for over two years, we have sufficient data to infer the structural and dynamic mechanisms that govern the topology and evolution of this complex social network system, using agent-based modeling and simulation techniques [9, 10]. In this paper, we present the process of building empirically derived agent-based models of the SourceForge OSS developer network and of simulating this collaborative social network. We accomplish this by extracting statistics of the SourceForge OSS network, including snapshots and longitudinal data. We present several network models of the evolution of SourceForge, the simulation library used [8], and the verification and validation process. The hidden social network processes that could plausibly generate the observed system properties are discovered through an iterative process of modeling, simulation, verification and validation.

We propose a conceptual framework for agent-based modeling and simulation, as shown in Figure 1. The framework has three entities: empirical data collection, model and simulation. The model is the process description that is implemented in the simulation, and with which we can reproduce the evolution of the empirical data. The simulation is a tool to verify and validate the model. The framework also has six edges, each representing one kind of process.
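To make this structure concrete, the framework can be written down as a small graph of three entities linked by the six processes defined in the text that follows. The sketch below is an illustration only, not the authors' implementation. Figure 1 is not reproduced here, so the pair of entities assigned to each process is an assumption inferred from the process definitions, edge direction is deliberately left out, and all Python names (Entity, Process, FRAMEWORK, processes_touching) are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Entity(Enum):
    """The three entities named in the framework."""
    EMPIRICAL_DATA = "empirical data collection"
    MODEL = "model"
    SIMULATION = "simulation"


@dataclass(frozen=True)
class Process:
    """One edge of the framework."""
    name: str
    links: frozenset  # the two entities connected (ASSUMED, not taken from Figure 1)
    role: str         # short paraphrase of the definition in the text


def link(a, b):
    """Build an undirected entity pair; direction is intentionally not modeled."""
    return frozenset({a, b})


# ASSUMPTION: each pair of entities is joined by two of the six processes.
FRAMEWORK = [
    Process("characterization", link(Entity.EMPIRICAL_DATA, Entity.MODEL),
            "abstract data characteristics into model rules and attributes"),
    Process("description", link(Entity.EMPIRICAL_DATA, Entity.MODEL),
            "expose the mechanisms underlying the data's evolution"),
    Process("generation", link(Entity.MODEL, Entity.SIMULATION),
            "build a simulation from the model"),
    Process("adjustment", link(Entity.MODEL, Entity.SIMULATION),
            "modify the model using verification feedback"),
    Process("verification", link(Entity.SIMULATION, Entity.EMPIRICAL_DATA),
            "compare simulation output with data and designed behavior"),
    Process("validation", link(Entity.SIMULATION, Entity.EMPIRICAL_DATA),
            "compare simulation output with data not used to define the model"),
]


def processes_touching(entity):
    """Return the sorted names of all processes that involve the given entity."""
    return sorted(p.name for p in FRAMEWORK if entity in p.links)


if __name__ == "__main__":
    for entity in Entity:
        print(entity.value, "->", processes_touching(entity))
```

Under these assumed pairings, every entity takes part in four of the six processes, which reflects the iterative character of the framework: no entity is a pure input or a pure output.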
Characterization abstracts the characteristics of the empirical data and generates the rules and attributes of the model. Description makes explicit the underlying mechanisms of the evolution of the empirical data. Generation builds a simulation from the given model. Adjustment modifies the model according to feedback from the verification process. Verification tests the simulation's behavior by comparing the simulation output with the empirical data and with the designed model behavior. Validation interpolates or extrapolates the simulation output and compares it with empirical data and attributes that were not used to define the model. Validation may add rules or attributes to the model to improve it further. The main goal of this kind of study is to obtain, through repeated simulation iterations, a “fit” model that describes the evolution of a collaborative network. We explain this conceptual framework using a real example: a study of the SourceForge collaboration network [1]. The