9 Additive Models, Trees, and Related Methods
In this chapter we begin our discussion of some specific methods for supervised learning. These techniques each assume a (different) structured form
for the unknown regression function, and by doing so they finesse the curse
of dimensionality. Of course, they pay the possible price of misspecifying
the model, and so in each case there is a tradeoff that has to be made. They
take off where Chapters 3–6 left off. We describe five related techniques:
generalized additive models, trees, multivariate adaptive regression splines,
the patient rule induction method, and hierarchical mixtures of experts.
9.1 Generalized Additive Models
Regression models play an important role in many data analyses, providing prediction and classification rules, and data analytic tools for understanding the importance of different inputs.
Although attractively simple, the traditional linear model often fails in
these situations: in real life, effects are often not linear. In earlier chapters
we described techniques that used predefined basis functions to achieve
nonlinearities. This section describes more automatic flexible statistical
methods that may be used to identify and characterize nonlinear regression
effects. These methods are called “generalized additive models.”
In the regression setting, a generalized additive model has the form
\[
\mathrm{E}(Y \mid X_1, X_2, \ldots, X_p) = \alpha + f_1(X_1) + f_2(X_2) + \cdots + f_p(X_p). \tag{9.1}
\]
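To make the additive form (9.1) concrete, here is a minimal sketch that fits such a model by backfitting: cycle over the predictors, smoothing the partial residuals against each \(X_j\) in turn and centering each \(\hat{f}_j\) for identifiability. The crude running-mean smoother and the simulated functions \(f_1(x) = \sin(2\pi x)\), \(f_2(x) = x^2\) are illustrative choices, not part of the text; the book's own development uses spline smoothers.

```python
import numpy as np

def running_mean_smoother(x, r, window=21):
    # Smooth residuals r against x with a simple running mean
    # (an illustrative stand-in for a spline smoother; the zero
    # padding of np.convolve shrinks the fit near the edges).
    order = np.argsort(x)
    kernel = np.ones(window) / window
    smoothed = np.convolve(r[order], kernel, mode="same")
    fit = np.empty_like(r)
    fit[order] = smoothed
    return fit

def backfit_additive_model(X, y, n_iter=20):
    # Fit E(Y|X) = alpha + sum_j f_j(X_j) by backfitting:
    # smooth the partial residuals against each predictor in turn.
    n, p = X.shape
    alpha = y.mean()
    f = np.zeros((n, p))
    for _ in range(n_iter):
        for j in range(p):
            others = [k for k in range(p) if k != j]
            partial = y - alpha - f[:, others].sum(axis=1)
            f[:, j] = running_mean_smoother(X[:, j], partial)
            f[:, j] -= f[:, j].mean()  # each f_j averages to zero
    return alpha, f

# Simulated example: y = 2 + sin(2*pi*x1) + x2^2 + noise
rng = np.random.default_rng(0)
n = 500
X = rng.uniform(-1, 1, size=(n, 2))
y = 2 + np.sin(2 * np.pi * X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.normal(size=n)
alpha, f = backfit_additive_model(X, y)
```

Each recovered column of `f` can be plotted against its predictor to visualize the estimated component function, which is the chief interpretive appeal of the additive form.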
© Springer Science + Business Media, LLC 2009
T. Hastie et al., The Elements of Statistical Learning, Second Edition, DOI: 10.1007/b94608_9