Citation: Caroni, C. Regression Models for Lifetime Data: An Overview. Stats 2022, 5, 1294–1304. https://doi.org/10.3390/stats5040078
Academic Editor: Wei Zhu
Received: 6 November 2022
Accepted: 3 December 2022
Published: 7 December 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
Regression Models for Lifetime Data: An Overview
Chrys Caroni
Department of Mathematics, National Technical University of Athens, 157 80 Athens, Greece; ccar@math.ntua.gr
Abstract: Two methods dominate the regression analysis of time-to-event data: the accelerated
failure time model and the proportional hazards model. Broadly speaking, these predominate in
reliability modelling and biomedical applications, respectively. However, many other methods
have been proposed, including proportional odds, proportional mean residual life and several
other “proportional” models. This paper presents an overview of the field and the concept behind
each of these ideas. Multi-parameter modelling is also discussed, in which (in contrast to, say, the
proportional hazards model) more than one parameter of the lifetime distribution may depend on
covariates. This includes first hitting time (or threshold) regression based on an underlying latent
stochastic process. Many of the methods that have been proposed have seen little or no practical use.
Lack of user-friendly software is certainly a factor in this. Diagnostic methods are also lacking for
most methods.
Keywords: lifetime data; regression; proportional hazards; proportional odds; mean residual life;
median residual life; proportional reversed hazards; accelerated failure time; first hitting time
1. Introduction
The purpose of the present paper is to give an overview of the several forms of
regression models that have been proposed for use when the dependent variable is the time
until the occurrence of an event, with simple examples being the death of a patient (survival
analysis) and the failure of a machine (reliability modelling). A basic form of statistical
model regresses the value y of a continuous dependent (response) random variable Y on the
values of a vector of covariates (predictors or explanatory variables) x recorded for the same
statistical unit. The standard example of regression is of course the general linear model
$$Y = \beta' x + \epsilon, \tag{1}$$
where $\beta$ is a vector of regression coefficients and the random error term $\epsilon$ in the
standard model follows the Normal distribution with zero mean and constant variance $\sigma^2$,
$\epsilon \sim N(0, \sigma^2)$. Taking, as usual, the values of x as fixed (not random), this implies that the
conditional distribution of Y given x is also normal,
$$Y \mid x \sim N(\beta' x, \sigma^2). \tag{2}$$
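As a rough illustration of the model in Equations (1) and (2), the following Python sketch simulates data from a normal linear model, recovers the coefficients by least squares, and evaluates the probability that the model assigns to a negative response at a given covariate value. The single covariate plus intercept, the parameter values, and the function names are all assumptions made for this example and do not come from the paper.

```python
import random
from statistics import NormalDist, fmean


def simulate_linear_model(n, beta0, beta1, sigma, seed=0):
    """Draw (x, y) pairs from Y = beta0 + beta1*x + eps, eps ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]  # covariate values, treated as fixed
    ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


def fit_ols(xs, ys):
    """Closed-form least squares for one covariate plus an intercept."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    # Unbiased residual standard deviation (two estimated coefficients).
    sigma_hat = (sum(r * r for r in residuals) / (len(xs) - 2)) ** 0.5
    return b0, b1, sigma_hat


def prob_negative(x, b0, b1, sigma):
    """P(Y < 0 | x) when Y | x ~ N(b0 + b1*x, sigma^2), as in Equation (2)."""
    return NormalDist(mu=b0 + b1 * x, sigma=sigma).cdf(0.0)


# Hypothetical parameter values, chosen only for illustration.
xs, ys = simulate_linear_model(n=500, beta0=1.0, beta1=0.5, sigma=2.0)
b0, b1, s = fit_ols(xs, ys)
print(f"estimates: b0={b0:.2f}, b1={b1:.2f}, sigma={s:.2f}")
print(f"P(Y < 0 | x=1) = {prob_negative(1.0, b0, b1, s):.3f}")
```

Under the normal model this probability is strictly positive for every x, which is one way to see why the model sits awkwardly with a response that cannot be negative.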
This model is not suitable for lifetime data: the dependent variable (usually time or a
proxy for time, such as the distance run by a vehicle) is non-negative, which demands a
special approach. The same models also apply to other cases of
non-negative dependent variables besides time, such as the load that can be applied to a
sample of material before it breaks. The regression models examined here are all intended
for use with dependent variables of these types. There will be no attempt to review each
model in the sense of covering all the developments, as they are far too numerous to permit
this. For example, it will be assumed here that the covariates x are measured at the baseline
(time origin of the study), whereas a necessary practical extension of any regression model
is also to allow for time-varying covariates. However, this extension does not alter the