Abstract—Software project effort estimation is frequently seen as complex and expensive for individual software engineers. Software production is in crisis: it suffers from excessive costs and is often out of control. It has been suggested that software production is out of control because we do not measure; you cannot control what you cannot measure. During the last decade, a number of studies on cost estimation have been conducted. Metric-set selection has a vital role in software cost estimation studies, yet its importance has been ignored, especially in neural network based studies. In this study we have explored the reasons for the disappointing results of those studies and implemented different neural network models using new, augmented metrics. The results obtained are compared with previous studies using traditional metrics. To be able to make comparisons, two types of data have been used. The first part of the data is taken from the Constructive Cost Model (COCOMO'81), which is commonly used in previous studies, and the second part is collected according to the new metrics in a leading international company in Turkey. The accuracy of the selected metrics and the data samples is verified using statistical techniques. The model presented here is based on the Multi-Layer Perceptron (MLP). Another difficulty associated with cost estimation studies is that data collection requires time and care. To make more thorough use of the samples collected, the k-fold cross validation method is also implemented. It is concluded that, as long as an accurate and quantifiable set of metrics is defined and measured correctly, neural networks can be applied in software cost estimation studies with success.

Keywords—Software Metrics, Software Cost Estimation, Neural Network.

I. INTRODUCTION

Software is becoming increasingly expensive to develop and is a major cost factor in any information system budget.
Software development costs often get out of control due to a lack of measurement and estimation methodologies. Software cost estimation, or software effort estimation, is the process of predicting the effort required to develop a software system. Software engineering cost models and estimation techniques are used for a number of purposes, including budgeting, tradeoff and risk analysis, project planning and control, and software improvement investment analysis [1].

Manuscript received August 31, 2006.
M. Ayyıldız is with the Computer Engineering Department, Yıldız Technical University, Istanbul, Turkey (e-mail: f0100301@yildiz.edu.tr).
O. Kalıpsız is with the Computer Engineering Department, Yıldız Technical University, Istanbul, Turkey (e-mail: kalipsiz@ce.yildiz.edu.tr).
S. Yavuz is with the Computer Engineering Department, Yıldız Technical University, Istanbul, Turkey (e-mail: sirma@ce.yildiz.edu.tr).

The accuracy of software project cost estimation has a direct and significant impact on the quality of the firm’s software investment decisions [2]. Accurate cost estimation can reduce unnecessary costs and increase the organization’s efficiency. For this reason, many estimation models have been proposed over the last 20 years. The review completed by Jørgensen and Shepperd [3] identifies 304 software cost estimation papers in 76 journals and classifies the papers according to research topic, estimation approach, research approach, study context and data set. Although there are a number of different approaches, these models may be classified as algorithmic and non-algorithmic. Each of these techniques has its advantages as well as limitations. Unfortunately, despite the large body of experience with estimation models, the accuracy of these models is still far from being satisfactory [4].
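As a rough illustration of the algorithmic class mentioned above, the Basic COCOMO'81 equation estimates effort as a power function of program size, E = a · (KLOC)^b, with E in person-months. The sketch below uses the Basic COCOMO'81 coefficients published by Boehm. The function and constant names are hypothetical and chosen only for this example, and the equation is not the model evaluated in this study, which feeds metric sets into neural networks instead of using a closed-form formula.

```python
# Basic COCOMO'81 coefficients (a, b) per development mode, as published by Boehm.
# The dictionary and function names are illustrative assumptions, not from the paper.
BASIC_COCOMO_COEFFICIENTS = {
    "organic": (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def basic_cocomo_effort(kloc: float, mode: str = "organic") -> float:
    """Estimate effort in person-months for a project of `kloc` thousand lines of code."""
    if kloc <= 0:
        raise ValueError("kloc must be positive")
    a, b = BASIC_COCOMO_COEFFICIENTS[mode]
    # Effort grows slightly faster than linearly with size because b > 1.
    return a * kloc ** b


if __name__ == "__main__":
    # Compare the three development modes for the same 10 KLOC project.
    for mode in BASIC_COCOMO_COEFFICIENTS:
        effort = basic_cocomo_effort(10.0, mode)
        print(f"{mode:>13}: {effort:.1f} person-months")
```

A non-algorithmic model, such as a neural network, replaces the fixed coefficients with parameters learned from project data, which is why the choice of input metrics matters so much.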
A Metric-Set and Model Suggestion for Better Software Project Cost Estimation. Murat Ayyıldız, Oya Kalıpsız, and Sırma Yavuz. World Academy of Science, Engineering and Technology 23, 2006.

Software development effort estimation with the aid of artificial neural networks (ANN) attracted considerable research interest, especially at the beginning of the nineties [5]. Most of these studies are based on the COCOMO’81 metric set. A key factor in selecting a cost estimation model is the accuracy of its metrics, since these models rely on metrics as their input. A metric can be defined as a quantitative measure of the degree to which a system, component, or process possesses a given attribute. It may seem easy to think of attributes of computer software products, processes, people or programming environments that can be measured. However, the real problem is identifying meaningful attributes to measure and then finding measurement processes that produce reliable and reproducible assessments of these attributes [6].

As the first step of our study, an augmented metric set was developed to take new technologies and more meaningful parameters into account. This augmented model, called the Yıldız effort estimation model (YEEM), includes all the metrics that exist in COCOMO as well as new ones derived from more recent software development projects, studies and experience. In this paper, we present the augmented metric set and investigate the success of the MLP for both the COCOMO’81 metric set and the augmented metric set (YEEM). Tests were run on datasets containing the same number of samples. To investigate further, we have also experimented with larger datasets formed according to the augmented metrics. Addressing the issues of dataset characteristics and the number of samples in the datasets is one of the purposes of this research. Since the number of samples we have collected is still limited, 5-fold, 10-fold and 15-fold cross validation