Research Article
Received 12 August 2011, Accepted 24 March 2012 Published online in Wiley Online Library
(wileyonlinelibrary.com) DOI: 10.1002/sim.5417
Testing goodness of fit in regression: a general approach for specified alternatives
Aldo Solari,^a,*,† Saskia le Cessie^b,c and Jelle J. Goeman^c
When fitting generalized linear models or the Cox proportional hazards model, it is important to have tools to
test for lack of fit. Because lack of fit comes in all shapes and sizes, distinguishing among different types of lack
of fit is of practical importance. We argue that an adequate diagnosis of lack of fit requires a specified alternative
model. Such specification identifies the type of lack of fit the test is directed against so that if we reject the null
hypothesis, we know the direction of the departure from the model. The goodness-of-fit approach of this paper
allows different types of lack of fit to be treated within a unified general framework and many existing tests to be
considered as special cases. Connections with penalized likelihood and random effects are discussed, and the application of
the proposed approach is illustrated with medical examples. Tailored functions for goodness-of-fit testing have
been implemented in the R package globaltest. Copyright © 2012 John Wiley & Sons, Ltd.
Keywords: goodness of fit; logistic regression; generalized linear models; Cox proportional hazards model
1. Introduction
A goodness-of-fit test addresses the question: ‘Is there evidence of inconsistency of data with a statistical
model?’ However, a departure from the model may happen in different directions. A linear model, for
example, may fail because transformations of covariates are required, or because interaction effects have
been missed, or both. Distinguishing the different types of lack of fit is of practical importance: if we
find evidence against the model, we generally also want to know why the model does not fit.
We argue that an adequate diagnosis of lack of fit requires the specification of the alternative model.
There are two aspects to this. Firstly, the alternative represents the type of lack of fit of interest and will
result in a test statistic sensitive to it, because general criteria such as the Neyman–Pearson theory are
applicable. Secondly, the alternative model can be fitted and interpreted, giving some guide as to the type
of lack of fit that may be present. A goodness-of-fit test should, therefore, be specific about the type of
lack of fit it is directed against.
However, most of the goodness-of-fit tests in routine use and provided in standard software either
leave the alternative model unspecified or formulate a very particular alternative. An example is the
Hosmer–Lemeshow [1] test for logistic regression. The type of lack of fit to which it is especially
sensitive is unspecified, and there is no alternative model to fit. An example of a test with a very particular
alternative is the F test of the simple linear model against quadratic regression. Such a specific
alternative has the drawback that more complex relationships between the response and the covariate may
be overlooked.
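As a concrete illustration of a test with a very particular alternative, the following Python sketch computes the F statistic for the simple linear model against quadratic regression. It is a minimal illustration under stated assumptions, not the method proposed in this paper and not the interface of the globaltest package: the function names fit_rss and quadratic_lack_of_fit_f and the small data set are hypothetical, and only the standard library is used. Under the null model with normal errors, the statistic would be compared with an F distribution with 1 and n - 3 degrees of freedom.

```python
def fit_rss(columns, y):
    """Least-squares fit of y on the given design columns.

    Returns the residual sum of squares (hypothetical helper, for illustration).
    """
    p, n = len(columns), len(y)
    # Augmented normal equations: [X'X | X'y]
    a = [
        [sum(columns[i][k] * columns[j][k] for k in range(n)) for j in range(p)]
        + [sum(columns[i][k] * y[k] for k in range(n))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(p):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    beta = [a[i][p] / a[i][i] for i in range(p)]
    fitted = [sum(beta[j] * columns[j][k] for j in range(p)) for k in range(n)]
    return sum((yk - fk) ** 2 for yk, fk in zip(y, fitted))


def quadratic_lack_of_fit_f(x, y):
    """F statistic for H0: linear in x against H1: quadratic in x."""
    n = len(y)
    mean_x = sum(x) / n
    ones = [1.0] * n
    xc = [xi - mean_x for xi in x]   # centering leaves both fits unchanged
    xc2 = [v * v for v in xc]
    rss_null = fit_rss([ones, xc], y)       # simple linear model
    rss_alt = fit_rss([ones, xc, xc2], y)   # quadratic alternative
    # One extra parameter under the alternative; n - 3 residual degrees of freedom
    return (rss_null - rss_alt) / (rss_alt / (n - 3))


# Hypothetical data: one response is linear in x, the other has curvature.
x = [float(i) for i in range(10)]
noise = [0.1, -0.2, 0.15, -0.05, 0.2, -0.1, 0.05, -0.15, 0.1, -0.1]
y_linear = [2 + 3 * xi + e for xi, e in zip(x, noise)]
y_curved = [2 + 3 * xi + 0.5 * xi ** 2 + e for xi, e in zip(x, noise)]

print("F, linear data:", quadratic_lack_of_fit_f(x, y_linear))
print("F, curved data:", quadratic_lack_of_fit_f(x, y_curved))
```

The statistic is small for the linear data and very large for the curved data, so the test detects quadratic curvature. A departure that is orthogonal to the quadratic term, such as an odd cubic component over a symmetric design, would contribute little to this statistic, which is the drawback noted above.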
^a Department of Statistics, University of Milano-Bicocca, via Bicocca degli Arcimboldi 8, 20126 Milan, Italy
^b Department of Clinical Epidemiology, Leiden University Medical Center, PO Box 9600, 2300 RC Leiden, The Netherlands
^c Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, PO Box 9600, 2300 RC Leiden, The Netherlands
* Correspondence to: Aldo Solari, Department of Statistics, University of Milano-Bicocca, via Bicocca degli Arcimboldi 8, 20126 Milan, Italy.
† E-mail: aldo.solari@unimib.it
Copyright © 2012 John Wiley & Sons, Ltd. Statist. Med. 2012