Journal of Statistical Planning and Inference 156 (2015) 80–89
Estimation for semiparametric transformation models with
length-biased sampling
Xuan Wang (a), Qihua Wang (a,b,∗)
(a) Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, PR China
(b) Institute of Statistical Science, Shenzhen University, Shenzhen 518006, PR China
article info
Article history:
Received 20 November 2013
Received in revised form 15 April 2014
Accepted 2 August 2014
Available online 19 August 2014
MSC:
62N01
62N02
62P10
Keywords:
Length-biased data
Right-censored data
Transformation model
Estimating equation
abstract
For length-biased and right-censored data, we propose an estimation method to assess the
effects of risk factors under the semiparametric linear transformation model. Unlike the
existing method of Shen et al. (2009) based on the ranks of observed failure times, the
new estimators are obtained from counting process-based unbiased estimating equations.
Consistency and asymptotic normality for the estimators are derived under suitable
regularity conditions. We evaluate the finite-sample performance of the proposed method
and compare it with that of Shen et al. (2009) through simulation studies. A real data
example is analyzed to illustrate the proposed method.
© 2014 Elsevier B.V. All rights reserved.
1. Introduction
Length-biased data are left-truncated and right-censored data that satisfy the stationarity assumption, namely that
the initiation times follow a stationary Poisson process. Such data are often encountered in observational studies; many
examples can be found in Qin and Shen (2010) and Shen et al. (2009). Under length-biased sampling, subjects are not
sampled at random from the population of interest but with probability proportional to the length of their failure times.
As a result, the observed times from initiation to failure in the prevalent cohort tend to be longer than those drawn
from the underlying distribution of the general population.
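The effect of sampling with probability proportional to length can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the exponential population with mean 1, the sample sizes, and the helper name length_biased_sample are hypothetical choices and are not taken from the paper. It uses the standard fact that the mean of a length-biased sample is E[T^2]/E[T] rather than E[T].

```python
import numpy as np


def length_biased_sample(population, size, rng):
    """Draw `size` values from `population`, each with probability proportional to its value.

    Hypothetical helper for illustration only.
    """
    weights = population / population.sum()
    return rng.choice(population, size=size, replace=True, p=weights)


rng = np.random.default_rng(0)

# Assumed population of failure times: exponential with mean 1 (illustrative choice).
population = rng.exponential(scale=1.0, size=200_000)

# Sample with probability proportional to length, as under length-biased sampling.
biased = length_biased_sample(population, size=50_000, rng=rng)

pop_mean = population.mean()
biased_mean = biased.mean()

# Mean implied by the length-biased density t f(t) / E[T], i.e. E[T^2] / E[T].
theory_mean = (population ** 2).mean() / population.mean()

print(f"population mean:       {pop_mean:.3f}")
print(f"length-biased mean:    {biased_mean:.3f}")
print(f"E[T^2]/E[T] (theory):  {theory_mean:.3f}")
```

In this setup the length-biased mean comes out close to E[T^2]/E[T] and clearly above the population mean, which is the sense in which observed times in the prevalent cohort tend to be longer.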
An extensive literature has focused on estimating the unbiased distribution from length-biased data (Wang, 1991;
Asgharian and Wolfson, 2005). Two main difficulties arise in analyzing length-biased data. First, when
studying the effects of risk factors on the population failure time, the model structure assumed for the target population
often differs from that of the observed length-biased data. Second, the failure time and the right-censoring time
are not independent except in trivial cases. To model the effects of risk factors on the distribution of the underlying
population, Wang (1996) described a proportional hazards model for length-biased data and used a bias-adjusted risk set
to construct a pseudo-likelihood for estimation, ignoring right censoring. Qin and Shen (2010) proposed two estimating
equations for the covariate coefficients under the Cox model, each based on a mean-zero process.
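The dependence between the failure time and the right-censoring time noted above can be seen in a short simulation. This is a sketch under assumptions that are not stated in this excerpt: the usual decomposition in which the observed failure time is T = A + V and the censoring time is C = A + C_res, where A is the truncation time (uniform on (0, T) given T under stationarity), V is the residual lifetime, and C_res is a residual censoring time independent of (A, V). The distributions and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Assumed population law: Exp(1). Its length-biased version has density t * exp(-t),
# which is Gamma(shape=2, scale=1).
t = rng.gamma(shape=2.0, scale=1.0, size=n)

# Under stationarity, the truncation time A given T is uniform on (0, T).
a = rng.uniform(0.0, 1.0, size=n) * t

# Residual lifetime after recruitment.
v = t - a

# Assumed residual censoring time, independent of (A, V); Uniform(0, 3) is illustrative.
c_res = rng.uniform(0.0, 3.0, size=n)

# Total censoring time measured from initiation shares the component A with T.
c = a + c_res

corr_t_c = np.corrcoef(t, c)[0, 1]          # failure time vs. censoring time
corr_v_cres = np.corrcoef(v, c_res)[0, 1]   # residual lifetime vs. residual censoring

print(f"corr(T, C)     = {corr_t_c:.3f}")
print(f"corr(V, C_res) = {corr_v_cres:.3f}")
```

Because T and C both contain A, they are positively correlated in this setup even though the residual censoring time is independent of the residual lifetime.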
∗ Corresponding author at: Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, PR China.
E-mail address: qhwang@amss.ac.cn (Q. Wang).
http://dx.doi.org/10.1016/j.jspi.2014.08.001
0378-3758/© 2014 Elsevier B.V. All rights reserved.