Empirical Software Engineering, 5, 35–68 (2000)
© 2000 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.

A Simulation Tool for Efficient Analogy Based Cost Estimation

L. ANGELIS lef@csd.auth.gr
Department of Informatics, Aristotle University of Thessaloniki, 54006, Thessaloniki, GREECE

I. STAMELOS stamelos@csd.auth.gr
Department of Informatics, Aristotle University of Thessaloniki, 54006, Thessaloniki, GREECE

Abstract. Estimating software project effort from project analogies is a promising method in the area of software cost estimation. Projects in a historical database that are analogous (similar) to the project under examination are detected, and their effort data are used to produce estimates. As in all software cost estimation approaches, important decisions must be made regarding certain parameters in order to calibrate the method with local data and obtain reliable estimates. In this paper, we present a statistical simulation tool, namely the bootstrap method, which helps the user tune the analogy approach before applying it to real projects. This is an essential step of the method, because if inappropriate parameter values are selected in the first place, the estimate will inevitably be wrong. Additionally, we show how measures of accuracy, and in particular confidence intervals, may be computed for the analogy-based estimates, using the bootstrap method with different assumptions about the population distribution of the data set. Confidence intervals for the estimates are necessary in order to assess point estimate accuracy and to assist risk analysis and project planning. Examples of bootstrap confidence intervals and a comparison with regression models are presented on well-known cost data sets.

Keywords: Software cost estimation, bootstrap samples, confidence intervals, distance metrics, estimation by analogy, regression models

1. Introduction

The more important software becomes in almost every human activity, the more complex and difficult to implement it becomes. Even though modern software technologies make certain types of software products easier to develop, increased user demands and new application domains produce additional problems. It is not surprising that software project management activities are becoming increasingly important.

One of the most critical activities during the software life cycle is estimating the effort and time involved in developing the software product under consideration. This task is known as Software Cost Estimation. Estimates may be produced before, during and after the development of software. Cost and time estimates are necessary during the first phases of the software life cycle, in order to decide whether or not to proceed (feasibility study). Accurate estimates are difficult to obtain at this point, since the available data may be imprecise and wrong assumptions may be made. During the development process, the cost and time estimates are useful for the initial rough validation and the monitoring of the project's progress. After completion, these estimates may be useful for project productivity assessment.