Journal of Statistical Planning and Inference 138 (2008) 552 – 567
www.elsevier.com/locate/jspi
Efficient mean estimation in log-normal linear models
Haipeng Shen, Zhengyuan Zhu∗
Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
Received 6 April 2006; received in revised form 28 September 2006; accepted 13 October 2006
Available online 14 March 2007
Abstract
Log-normal linear models are widely used in applications, and it is often of interest to predict the response variable or to
estimate the mean of the response variable at the original scale for a new set of covariate values. In this paper we consider the problem
of efficient estimation of the conditional mean of the response variable at the original scale for log-normal linear models. Several
existing estimators are reviewed first, including the maximum likelihood (ML) estimator, the restricted ML (REML) estimator, the
uniformly minimum variance unbiased (UMVU) estimator, and a bias-corrected REML estimator. We then propose two estimators
that minimize the asymptotic mean squared error and the asymptotic bias, respectively. A parametric bootstrap procedure is also
described to obtain confidence intervals for the proposed estimators. Both the new estimators and the bootstrap procedure are
very easy to implement. Comparisons of the estimators using simulation studies suggest that our estimators perform better than the
existing ones, and the bootstrap procedure yields confidence intervals with good coverage properties. A real application of estimating
the mean sediment discharge is used to illustrate the methodology.
© 2007 Elsevier B.V. All rights reserved.
Keywords: Maximum likelihood; Parametric bootstrap; Mean squared error; Uniformly minimum variance unbiased; Sediment discharge
1. Introduction
The prevalence of log-normality has been reported in a wide range of applications from mining (Marcotte and
Groleau, 1997), insurance reserves estimation (Doray, 1996), water quality control (Gilliom and Helsel, 1986), to air
pollution concentration monitoring (Holland et al., 2000) and sediment discharge estimation (Cohn, 1995; Elliott and
Anders, 2004), to name just a few. Log-normal linear models are often used in these applications, in which linear models
are fitted to logarithmically transformed response variables. To fix ideas, let $Z = (Z_1, \ldots, Z_n)^T$ be the log-normal response
vector, and $x_i = (1, x_{i1}, \ldots, x_{ip})^T$ be the covariate vector for observation $i$. A log-normal linear model assumes that
$$Y = \log(Z) = X\beta + \epsilon, \tag{1}$$
where $X = (x_1, \ldots, x_n)^T$, $\beta = (\beta_0, \beta_1, \ldots, \beta_p)^T$, and $\epsilon = (\epsilon_1, \ldots, \epsilon_n)^T$ with $\epsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)$.
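As an illustration of model (1), the following minimal sketch simulates data from a log-normal linear model with a single covariate and fits the linear model on the log scale. It then compares the naive back-transform $\exp(x_0^T\hat\beta)$ with the plug-in estimate $\exp(x_0^T\hat\beta + \hat\sigma^2/2)$, which follows from the standard log-normal mean formula $E(Z_0) = \exp(x_0^T\beta + \sigma^2/2)$. The function names, parameter values, sample size, and seed are assumptions chosen for illustration only; they are not taken from the paper, and the sketch does not implement the estimators proposed in it.

```python
import math
import random


def simulate_lognormal(xs, beta0, beta1, sigma, rng):
    """Draw Z_i = exp(beta0 + beta1 * x_i + eps_i), with eps_i ~ N(0, sigma^2)."""
    return [math.exp(beta0 + beta1 * x + rng.gauss(0.0, sigma)) for x in xs]


def fit_log_scale(xs, zs):
    """Least-squares fit of log(Z) on (1, x).

    Returns (b0, b1, s2), where s2 is the ML variance estimate (RSS / n).
    """
    n = len(xs)
    ys = [math.log(z) for z in zs]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))
    return b0, b1, rss / n


def true_mean(x0, beta0, beta1, sigma):
    """Conditional mean of Z_0 on the original scale (log-normal mean formula)."""
    return math.exp(beta0 + beta1 * x0 + 0.5 * sigma ** 2)


def naive_back_transform(x0, b0, b1):
    """exp of the fitted log-scale mean; this targets the median of Z_0, not the mean."""
    return math.exp(b0 + b1 * x0)


def ml_mean_estimate(x0, b0, b1, s2):
    """Plug-in estimate of the mean that adds the s2 / 2 correction term."""
    return math.exp(b0 + b1 * x0 + 0.5 * s2)


# Illustrative (assumed) settings: 2000 equally spaced covariate values on [0, 10).
rng = random.Random(0)
xs = [i / 200 for i in range(2000)]
beta0, beta1, sigma = 1.0, 0.3, 0.8
zs = simulate_lognormal(xs, beta0, beta1, sigma, rng)

b0, b1, s2 = fit_log_scale(xs, zs)
x0 = 5.0
print("true mean          :", true_mean(x0, beta0, beta1, sigma))
print("naive back-transform:", naive_back_transform(x0, b0, b1))
print("ML plug-in estimate :", ml_mean_estimate(x0, b0, b1, s2))
```

Because the fitted variance is positive, the naive back-transform is always smaller than the plug-in estimate, which is why simply exponentiating the fitted log-scale value tends to underestimate the mean on the original scale.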
In many cases, for a new set of covariate values $x_0$, one is interested in predicting the response variable at the original
scale,
$$Z_0 = \exp(x_0^T \beta + \epsilon_0),$$
∗ Corresponding author. Tel./fax: +1 919 843 2431.
E-mail address: zhuz@email.unc.edu (Z. Zhu).
0378-3758/$ - see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.jspi.2006.10.016