Journal of Statistical Planning and Inference 156 (2015) 80–89

Estimation for semiparametric transformation models with length-biased sampling

Xuan Wang a, Qihua Wang a,b,∗

a Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, PR China
b Institute of Statistical Science, Shenzhen University, Shenzhen 518006, PR China

Article history: Received 20 November 2013; received in revised form 15 April 2014; accepted 2 August 2014; available online 19 August 2014

MSC: 62N01; 62N02; 62P10

Keywords: Length-biased data; Right-censored data; Transformation model; Estimating equation

Abstract

For length-biased and right-censored data, we propose an estimation method to assess the effects of risk factors under the semiparametric linear transformation model. Unlike the existing method of Shen et al. (2009), which is based on the ranks of observed failure times, the new estimators are obtained from counting process-based unbiased estimating equations. Consistency and asymptotic normality of the estimators are derived under suitable regularity conditions. We evaluate the finite-sample performance of the proposed method and compare it with that of Shen et al. (2009) in simulation studies. A real data example is analyzed to illustrate the proposed method.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

By definition, length-biased data are left-truncated and right-censored data under the stationarity assumption that the initial times follow a stationary Poisson process. Length-biased data are often encountered in observational studies. Many examples of length-biased data can be found in Qin and Shen (2010) and Shen et al. (2009).
Under length-biased sampling, the observed subjects are not sampled at random from the population of interest but with probability proportional to the length of the time from initiation to failure, so the observed times from initiation to failure in the prevalent cohort tend to be longer than those from the underlying distribution for the general population. An extensive literature has focused on estimating the unbiased distribution from length-biased data (Wang, 1991; Asgharian and Wolfson, 2005).

There are two main difficulties in analyzing length-biased data. One is that, when studying the effects of risk factors on the population failure time, the model structure assumed for the target population often differs from that of the observed length-biased data. The other is that the failure time and the right-censoring time are not independent except in trivial cases. To model the effects of risk factors on the distribution of the underlying population, Wang (1996) described a proportional hazards model for length-biased data and used a bias-adjusted risk set to construct a pseudo-likelihood for estimation, ignoring right censoring. Qin and Shen (2010) proposed two estimating equations for the covariate coefficients under the Cox model, each based on a mean-zero process.

∗ Corresponding author at: Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, PR China. E-mail address: qhwang@amss.ac.cn (Q. Wang).
http://dx.doi.org/10.1016/j.jspi.2014.08.001
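
The sampling mechanism described in the Introduction, selection with probability proportional to the length of the failure time, can be illustrated with a small simulation. The sketch below is not taken from the paper: the exponential failure-time distribution, the sample sizes, and the function names `length_biased_sample` and `simulate` are assumptions chosen only for illustration, and censoring and covariates are ignored. For a failure time T with density f and mean μ, the length-biased density is t f(t)/μ, so the length-biased mean is E[T²]/E[T]; for an exponential distribution with rate 1 this is 2, twice the population mean.

```python
import random


def length_biased_sample(times, k, rng):
    """Draw k values from `times`, each chosen with probability
    proportional to its own length (weighted resampling)."""
    return rng.choices(times, weights=times, k=k)


def simulate(n_population=100_000, n_sample=20_000, rate=1.0, seed=0):
    """Compare the population mean with the mean of a length-biased sample.

    The exponential distribution is an illustrative assumption only.
    """
    rng = random.Random(seed)
    # Failure times in the target (unbiased) population.
    population = [rng.expovariate(rate) for _ in range(n_population)]
    # Observed cohort: longer failure times are more likely to be selected.
    biased = length_biased_sample(population, n_sample, rng)
    pop_mean = sum(population) / len(population)
    biased_mean = sum(biased) / len(biased)
    return pop_mean, biased_mean


pop_mean, biased_mean = simulate()
print(f"population mean:    {pop_mean:.3f}")
print(f"length-biased mean: {biased_mean:.3f}")
```

The length-biased mean should come out close to twice the population mean, which is why an analysis that treats the observed failure times as a random sample from the target population overstates survival.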