Transpn. Res.-A Vol. 19A, No. 4, pp. 315-324, 1985 0191-2607/85 $3.00 + .00
Printed in Great Britain © 1985 Pergamon Press Ltd
DISAGGREGATE MODE CHOICE MODELS AND THE
AGGREGATION ISSUE: SOME EMPIRICAL RESULTS
J. P. DUNNE
University of Warwick, Coventry, CV4 7AL, U.K.
(Received 4 April 1983; in revised form 20 January 1985)
Abstract--This paper presents a comparative analysis of the alternative approaches to providing aggregate
prediction models from disaggregate mode choice models. In general, the results support the findings of previous
studies and illustrate the importance of realizing the trade-off between aggregation bias in simple procedures
and the practical problems of more complex approaches.
INTRODUCTION
Disaggregate mode choice models have shown them-
selves to have many advantages over conventional ag-
gregate models. In concentrating the analysis at the level
of the individual behavioural unit, they have allowed
consideration of the factors that influence the travel be-
haviour of individuals and have made more efficient use
of available data. Their consistent theoretical base, de-
veloped from the postulates of consumer rationality and
utility maximization, has allowed disaggregate models a
claim to generality. In addition, their encompassing of
policy-relevant variables has provided them with a po-
tentially more useful role in forecasting than descriptive
aggregate models. (See, for example, De Donnea, 1971;
Domencich and McFadden, 1975; Richards and Ben-Akiva,
1975.) While it is desirable to estimate choice models
at a disaggregate level, however, the use of the models
in prediction will generally require some level of aggre-
gation. The transformation of disaggregate models into
aggregate prediction models, although simple in principle
(complete enumeration), does raise practical problems,
as it requires predicted values for each individual in the
sample and therefore has rather extreme data require-
ments. As a result, "short cut" aggregation methods have
been developed that can be distinguished by the way in
which they represent distributions of the explanatory var-
iables across the sample.
The simplest is the naive approach, which uses the
average sample values of the independent variables to-
gether with the disaggregate model coefficient estimates.
This will, however, provide inaccurate predictions as the
average of a nonlinear function is not the same as the
function evaluated at the average values. To overcome
this problem, a number of other approaches have been
developed. These include the classification approach
(Koppelman, 1976), which uses the naive method on
relatively homogeneous subgroups; the statistical differ-
entials approach (Talvitie, 1973, 1976), which uses the
moments of the distribution of probabilities over the pop-
ulation; and the density function approach (Westin, 1974;
Watson and Westin, 1975), which uses a family of dis-
tributions to model the population relative frequency dis-
tribution (RFD) of probabilities.
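
The gap between the naive approach and full enumeration can be illustrated with a minimal sketch. It assumes a binary logit form for the disaggregate model; the coefficients, the cost differences and the function names (logit_prob, enumerate_share, naive_share, classified_share) are hypothetical and chosen only for illustration. They are not taken from the Livingston data or from the cited studies. The classified variant applies the naive method within subgroups and weights by subgroup size, in the spirit of the classification approach.

```python
import math

def logit_prob(x, beta):
    """Binary logit choice probability for one individual.

    1 / (1 + exp(-v)) is algebraically the same as exp(v) / (1 + exp(v)).
    """
    v = sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-v))

def enumerate_share(X, beta):
    """Enumeration: average the individual probabilities."""
    return sum(logit_prob(x, beta) for x in X) / len(X)

def naive_share(X, beta):
    """Naive approach: evaluate the model once, at the sample means."""
    k = len(X[0])
    x_bar = [sum(x[j] for x in X) / len(X) for j in range(k)]
    return logit_prob(x_bar, beta)

def classified_share(X, beta, classify):
    """Naive approach applied within subgroups, weighted by subgroup size."""
    groups = {}
    for x in X:
        groups.setdefault(classify(x), []).append(x)
    return sum(len(g) * naive_share(g, beta) for g in groups.values()) / len(X)

# Hypothetical inputs (assumptions, not estimates from the paper):
# a constant plus a cost-difference variable, for four individuals.
beta = [1.0, -0.5]
X = [[1.0, c] for c in (-6.0, -4.0, -2.0, 8.0)]

p_enum = enumerate_share(X, beta)
p_naive = naive_share(X, beta)
# Split the sample by the sign of the cost difference.
p_class = classified_share(X, beta, lambda x: x[1] < 0)

print(f"enumeration: {p_enum:.4f}")
print(f"naive:       {p_naive:.4f}")
print(f"classified:  {p_class:.4f}")
```

With these made-up values the naive share overstates the enumerated share, and the classified share lies much closer to it. The size and sign of the bias depend on how the explanatory variables are distributed, so the numbers should not be read as typical.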
This paper aims to provide some new empirical evi-
dence on the adequacy of these methods of aggregation
using data from a study of mode choice between Liv-
ingston New Town and Edinburgh. The disaggregate model
is a binary logit model of mode choice for the journey
to work. Although the true test of an "aggregated" pre-
diction model is its adequacy when confronted with new
data, the assessment here is based upon its representation
of the original sample on which the disaggregate models
were estimated. While this is only a first step in the
assessment of a model to be used for prediction, if a
model were to perform badly at this stage, it would cer-
tainly not be worth using it in more stringent tests. In
addition, the study provides some useful comparative
assessments of the various aggregation methods em-
ployed.
The first section of the paper outlines the available
approaches, the second section reports the results of their
empirical application and the third section presents some
conclusions.
METHODS OF AGGREGATION
Enumeration
The correct approach to aggregation is to calculate the
average probability by estimating each individual's choice
probability and taking the average. In this case, the values
of the independent variables directly relevant to each
individual in the prediction group (complete enumeration)
or a subset of that group (sample enumeration) are used.
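Thus, the average probability ($\bar{P}$) is

$$\bar{P} = \frac{1}{N} \sum_{i=1}^{N} P_i, \tag{1}$$

where

$$P_i = \frac{e^{x_i'\beta}}{1 + e^{x_i'\beta}},$$

$N$ being the number of individuals enumerated, $\beta$ the vector of
parameter estimates, and $x_i$ the individual values of the
explanatory variables. Although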
providing the most theoretically consistent approach,