Risk Factors, Confounding, and the Illusion of Statistical Control
NICHOLAS J. S. CHRISTENFELD, PHD, RICHARD P. SLOAN, PHD, DOUGLAS CARROLL, PHD, AND SANDER GREENLAND, DRPH
Abstract: When experimental designs are premature, impractical, or impossible, researchers must rely on statistical methods to
adjust for potentially confounding effects. Such procedures, however, are quite fallible. We examine several errors that often follow
the use of statistical adjustment. The first is inferring a factor is causal because it predicts an outcome even after “statistical control”
for other factors. This inference is fallacious when (as usual) such control involves removing the linear contribution of imperfectly
measured variables, or when some confounders remain unmeasured. The converse fallacy is inferring a factor is not causally
important because its association with the outcome is attenuated or eliminated by the inclusion of covariates in the adjustment
process. This attenuation may only reflect that the covariates treated as confounders are actually mediators (intermediates) and
critical to the causal chain from the study factor to the study outcome. Other problems arise due to mismeasurement of the study
factor or outcome, or because these study variables are only proxies for underlying constructs. Statistical adjustment serves a useful
function, but it cannot transform observational studies into natural experiments, and involves far more subjective judgment than
many users realize.
Key words: confounds, risk factors, statistical control, mediators, covariates.
BP = blood pressure; SES = socioeconomic status; MI = myocardial infarction.
INTRODUCTION
In exploring risk factors for various diseases, we are often
forced, by timing, economics, or ethics, to use nonexperimental
designs. These designs bring with them numerous
interpretational problems, including the issue of confounding.
People who drink more coffee may also smoke more cigarettes
and drink more alcohol (1). Determining whether coffee
drinking itself increases mortality risk, and is not just a marker
for some other causal factor, must be approached not by
random assignment, but by statistical means. The basic technique
is to include measures of potential confounders as
regressors (covariates) in a regression model, or stratify the
data on these confounders. People then say they have “statistically
controlled” or adjusted for the potential confounders.
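As a minimal sketch of what regression adjustment does in the coffee example, the simulation below builds a hypothetical population in which smoking raises both coffee drinking and a continuous risk score, while coffee itself has no effect. The coefficients (0.6 and 0.5), the variable names, and the helper `ols_slopes` are illustrative assumptions, not values from any study. Under this model the crude coffee slope is about 0.22 in expectation, and it falls to about zero once the perfectly measured confounder is included as a covariate.

```python
import numpy as np

def ols_slopes(y, *predictors):
    """Least-squares slopes of y on the predictors (intercept fitted, then dropped)."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating model (an assumption for illustration):
# smoking drives both coffee drinking and risk; coffee has no causal effect.
smoking = rng.normal(size=n)
coffee = 0.6 * smoking + rng.normal(size=n)
risk = 0.5 * smoking + rng.normal(size=n)

# Crude association: coffee predicts risk only because it tracks smoking.
crude = ols_slopes(risk, coffee)[0]

# "Statistical control": add the confounder as a regressor.
adjusted = ols_slopes(risk, coffee, smoking)[0]

print(f"crude coffee slope:    {crude:.3f}")
print(f"adjusted coffee slope: {adjusted:.3f}")
```

The adjustment works here only because the simulated confounder is known, measured without error, and linearly related to the outcome, which are conditions an observational study can rarely guarantee.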
There are many tasks that adjustment performs well. In
experimental designs, covariate adjustment can reduce the
noise in outcome variation, and thus allow the manipulation
effect to stand out more clearly. Statistical adjustments perform
markedly less well at the epidemiologic tasks to which
they are regularly put. They simply cannot convert nonexperiments
to experiments because “statistical control” is fundamentally
distinct from experimental control (2,3). For example,
successful randomization tends to minimize confounding
by unmeasured as well as measured factors, whereas statistical
control addresses only confounding by what has been measured
and can introduce confounding and other biases through
inappropriate control (2,4–6). We shall briefly examine, with
examples, unjustified conclusions that can follow adjustment
for potential confounders, such as inferring that something is
a causal risk factor because it predicts an outcome even after
“adjustment” for possible confounders, and inferring that a
factor is not causally important because its impact is markedly
attenuated or eliminated by the inclusion of covariates, as can
happen when one adjusts for intermediate variables, or mediators
(4,7). By causal risk factor, we mean that if this factor
were altered, the outcome would be altered, whereas a marker
is predictive but not necessarily causal, and its manipulation
need not affect the outcome variable.
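The two unjustified conclusions described above can be illustrated with a small simulation. This is a sketch under stated assumptions: the linear models, the coefficients, the amount of measurement error, and the helper `ols_slopes` are all hypothetical choices made for illustration. In scenario A the exposure has no causal effect, yet adjusting for a noisy measure of the confounder leaves a clearly nonzero slope (about 0.13 in expectation under these numbers). In scenario B the exposure truly affects the outcome, entirely through a mediator, and adjusting for that mediator drives the exposure slope to about zero even though the total effect is 0.4.

```python
import numpy as np

def ols_slopes(y, *predictors):
    """Least-squares slopes of y on the predictors (intercept fitted, then dropped)."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]

rng = np.random.default_rng(1)
n = 100_000

# Scenario A (assumed model): a marker with NO causal effect still predicts
# the outcome after "control" because the confounder is measured with error.
confounder = rng.normal(size=n)
exposure_a = 0.6 * confounder + rng.normal(size=n)
outcome_a = 0.5 * confounder + rng.normal(size=n)    # exposure_a is absent
measured = confounder + rng.normal(size=n)           # noisy proxy of the confounder
residual_a = ols_slopes(outcome_a, exposure_a, measured)[0]

# Scenario B (assumed model): a real cause looks unimportant after adjusting
# for a mediator on the path exposure -> mediator -> outcome.
exposure_b = rng.normal(size=n)
mediator = 0.8 * exposure_b + rng.normal(size=n)
outcome_b = 0.5 * mediator + rng.normal(size=n)      # total effect = 0.8 * 0.5
total_b = ols_slopes(outcome_b, exposure_b)[0]
adjusted_b = ols_slopes(outcome_b, exposure_b, mediator)[0]

print(f"A: slope after adjusting for noisy confounder: {residual_a:.3f}")
print(f"B: total effect of exposure:                   {total_b:.3f}")
print(f"B: slope after adjusting for mediator:         {adjusted_b:.3f}")
```

The regression output alone cannot distinguish these cases: whether a covariate is a confounder or a mediator, and how well it is measured, are judgments that come from outside the data.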
Such issues are treated in detail in certain epidemiologic
texts (8,9) but seem to be underappreciated in behavioral
medicine research. There are other, more subtle dangers in the
use of covariates that we will not discuss here but can be
found treated in some detail elsewhere (2–6,9).
Statistical Control: Necessary but Not Sufficient
It is fairly easy to find risk factors for premature morbidity
or mortality (10). Indeed, given a large enough study and
enough measured factors and outcomes, almost any potentially
interesting variable will be linked to some health outcome.
Many of these associations will be chance artifacts, but
some will represent replicable phenomena. Discovering such
associations is useful if one’s goal is simply to predict disease.
Even when not directly causal, associations can help target
groups for health education or screening. For example, it is
probably more useful to publish information about Tay-Sachs
screening in B’nai B’rith Magazine than to publish it in
Christianity Today. The difficulty comes, of course, when one
wants to move beyond simple prediction into health intervention,
or primary prevention; this requires that we distinguish
between a marker of a disease condition and an actual causal
risk factor. It would be one thing to find that B’nai B’rith
Magazine readers are more likely to be carriers of Tay-Sachs;
it would be another to suggest that canceling their subscriptions
would help. The problem, of course, is that magazine
subscription status is associated with many antecedent factors
that are related to the Tay-Sachs gene, and so is confounded
by these factors.
To examine the possibility that a particular factor is not
causal, but just a marker for a causal factor, a researcher
would include other known or plausible risk factors as covariates
and determine whether adjustment for these potential
From the Department of Psychology (N.J.S.C.), University of California,
San Diego, La Jolla, California; Department of Psychiatry, Columbia University
(R.P.S.), New York, New York; School of Sport & Exercise Sciences,
University of Birmingham (D.C.), Birmingham, England; and the Department
of Epidemiology, University of California Los Angeles (S.G.), Los Angeles,
California.
Address correspondence and reprint requests to Nicholas Christenfeld,
PhD, Department of Psychology, University of California, San Diego, La
Jolla, CA 92093-0109. E-mail: nicko@ucsd.edu
Received for publication December 4, 2003; revision received May 17,
2004.
DOI: 10.1097/01.psy.0000140008.70959.41
STATISTICAL CORNER
Psychosomatic Medicine 66:868–875 (2004)
0033-3174/04/6606-0868
Copyright © 2004 by the American Psychosomatic Society