ARIMAmmse: An Improved ARIMA-based
Software Productivity Prediction Method
Li Ruan (1,2), Yongji Wang (1,3), Qing Wang (1), Fengdi Shu (1), Haitao Zeng (1,2), Shen Zhang (1,2)
(1) Laboratory for Internet Software Technologies, Institute of Software,
The Chinese Academy of Sciences, Beijing 100080, China
(2) Graduate University, The Chinese Academy of Sciences, Beijing 100039, China
(3) Key Laboratory for Computer Science, The Chinese Academy of Sciences, Beijing 100080, China
{ruanli, ywang, wq, fdshu, zenghaitao, zhangshen}@itechs.iscas.ac.cn
Abstract
Productivity is a critical performance index of
process resources. As successive historical productivity
data tend to be auto-correlated, a time series
prediction method based on the Auto-Regressive
Integrated Moving Average (ARIMA) model was
introduced into software productivity prediction by
Humphrey et al. In this paper, a variant of their
prediction method named ARIMAmmse is proposed.
This variant formulates the ARIMA parameter
estimation issue as a minimum mean square error
(MMSE) based constrained optimization problem. The
ARIMA model is used to describe constraints of the
parameter estimation problem, while MMSE is used
as the objective function of the constrained
optimization problem. According to optimization
theory, ARIMAmmse achieves a higher prediction
precision in the MMSE sense than the method of
Humphrey et al., which is based on the Yule-Walker
estimation technique.
Two comparative experiments are also presented. The
experimental results further confirm the theoretical
superiority of ARIMAmmse.
1. Introduction
Developing reliable and high quality software
requires a well-coordinated and executed software
process. Since the 1980s, software process technology [1,
2] has emerged as a new discipline for developing
software systems that meet expected quality
requirements (e.g. dependability and security).
Productivity is a critical performance index of
software process resources. Precise software
productivity prediction is the first prerequisite for
achieving optimal resource scheduling, task assignment
and cost control [2]. However, the current software
process is highly evolving, with frequently changing
technologies and development methods [3].
This requires a productivity prediction method to
achieve a balance between forecasting stability and
responsiveness to changing conditions.
Furthermore, Humphrey et al. at the Carnegie Mellon
University Software Engineering Institute (CMU-SEI)
stated in [1] that “Because of the nature of software
work, the successive development times for individual
programmer or programming teams will also be auto-
correlated. This is because of an underlying learning
process which tends to improve successive
development productivities.” Therefore, the
productivity data generated by a software process
usually display patterns and time-dependent,
successive characteristics [4]. However, most of the
current typical productivity prediction methods (e.g.
CoBRA [5, 6]) do not assume that successive
productivity data are auto-correlated. Reference [7]
proposes a learning curve-based productivity
prediction method that takes these dynamic
characteristics into consideration, but the structure of
the learning curve is pre-defined.
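
To make the auto-correlation assumption discussed above concrete, the following minimal sketch computes the sample autocorrelation function of a short productivity series. It is an illustration only: the function name sample_acf, the use of the biased (divide-by-n) estimator, and the productivity figures are assumptions introduced here and are not taken from [1] or [4].

```python
def sample_acf(series, max_lag):
    """Return the sample autocorrelations r_0 .. r_max_lag of a series.

    Uses the biased estimator (each autocovariance is divided by n),
    which is a common textbook choice; this is an assumption of the sketch.
    """
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]          # deviations from the mean
    c0 = sum(v * v for v in dev) / n          # lag-0 autocovariance
    acf = [1.0]                               # r_0 is 1 by definition
    for k in range(1, max_lag + 1):
        ck = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf


# Hypothetical productivity values for ten successive tasks
# (illustrative numbers only, not data from the paper).
productivity = [20.0, 22.0, 21.0, 24.0, 26.0, 25.0, 28.0, 30.0, 29.0, 32.0]

r = sample_acf(productivity, max_lag=3)
print(r)
```

For a series with a learning trend like this hypothetical one, the lag-1 autocorrelation comes out clearly positive, which is the kind of dependence between successive observations that a time series model can exploit.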
The Time Series Analysis (TSA) method based on the
ARIMA model [4], which has already been applied
successfully in finance and industrial control,
was introduced into the software process field for
software productivity prediction by Humphrey et al. in [1].
Parameter estimation is a critical step in establishing an
ARIMA model. Inaccurate estimation results in
large prediction variance and low prediction precision.
The Yule-Walker technique, which was adopted by
Humphrey et al. in [1], is one of the most typical
parameter estimation techniques [4]. In the Yule-Walker
technique, a set of auto-correlation function
values of the series is first calculated. Next, using
the relationship between the parameters and the
auto-correlation function values embedded in the
Yule-Walker equations, the parameters are estimated by
solving these equations. Although this technique is
easy to use, many studies (e.g. [4]) have shown that it
Proceedings of the 30th Annual International Computer Software and Applications Conference (COMPSAC'06)
0-7695-2655-1/06 $20.00 © 2006