Purchase-Frequency Bias in
Random-Coefficients Brand-Choice Models
Anand V. BODAPATI
Anderson Graduate School of Management, University of California, Los Angeles, CA 90095 (bodapati@ucla.edu)
Sachin GUPTA
Johnson Graduate School of Management, Cornell University, Ithaca, NY 14853 (sg248@cornell.edu)
Conventional random-coefficients models of conditional brand choice using panel data ignore the
dependence of the random-coefficients distribution on the purchase frequencies. We show that this
omission leads to biased estimates and propose a conditional likelihood approach to obtain unbiased
estimates. Unlike alternative approaches that require observation of “no-purchase” occasions, our
proposed method relies only on purchase data. Furthermore, our approach does not require that the
researcher specify the distribution of purchase frequencies. As a result, estimates of the brand-choice
model are unaffected by misspecification of the model of purchase frequencies. We demonstrate the
performance of the proposed approach in simulated data and in scanner data. We find that results differ
substantively from the conventional latent-class model in terms of segment membership probabilities,
segment characteristics, and price elasticities.
KEY WORDS: Choice model; Estimation method; Latent class model; Multinomial logit; Panel data;
Random coefficient; Unobserved heterogeneity.
1. INTRODUCTION
A large body of marketing literature is based on statistical
modeling of consumer brand-choice behavior using household
panel data. In panel data we typically observe varying numbers
of category purchases made by households during any observed
time window. These differences in numbers of purchases reflect
differences in category consumption (e.g., heavy buyers
vs. light buyers) as well as the data-capture mechanism. For
instance, panel data gathered via the frequent-shopper card of a
particular retailer are likely to record more purchases by
consumers who are loyal to that retailer’s stores. In
this article we examine the implications of these differences in
observed purchase frequencies of panelists for the modeling of
brand-choice behavior.
Typically, researchers who are interested only in brand-choice
behavior develop brand-choice models that account for
heterogeneity in choice parameters between panelists, but ignore
the process that generates differences in purchase frequencies.
A partial list of such models includes those of Kamakura
and Russell (1989), Chintagunta, Jain, and Vilcassim (1991),
Gonul and Srinivasan (1993), and Fader and Hardie (1996). We
show in this article that if the purchase frequency of a household
is not independent of the parameters describing its choice
probabilities, then current estimation approaches lead to biased
estimates of the parameters of the choice model. We term this bias
“purchase frequency bias.” In numerical simulations we show
that the magnitude of this bias may be quite severe. Our result
applies to both of the primary models of unobserved heterogeneity
prevalent in the literature: the Gaussian continuous
random-effects model and the finite-mixture or latent-class model.
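The distinction drawn in this paragraph can be sketched in symbols. The notation is an assumption made here for illustration and is not taken from the article: $\theta_h$ denotes the choice parameters of household $h$, $n_h$ its number of purchases, $y_{ht}$ its brand choice on occasion $t$, and $f$ the random-coefficients (mixing) distribution. For a latent-class model the integral becomes a sum over segments.

```latex
% Conventional household likelihood: the mixing distribution ignores n_h
L_h^{\mathrm{conv}} = \int \prod_{t=1}^{n_h} P(y_{ht} \mid \theta)\, f(\theta)\, d\theta

% Likelihood of the choices given the purchase frequency: the mixing
% distribution is conditioned on n_h
L_h = \int \prod_{t=1}^{n_h} P(y_{ht} \mid \theta)\, f(\theta \mid n_h)\, d\theta

% The two expressions coincide when \theta_h and n_h are independent,
% that is, when f(\theta \mid n_h) = f(\theta)
```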
Should we expect the purchase frequencies and brand-choice
parameters to be dependent across panelists? Findings in the
literature suggest that this is indeed the case. For the canned tuna
fish category, Kim and Rossi (1994) reported the following:
“Our most striking finding is that consumers with high purchase
frequency or high purchase volume are much more price sensitive
and have more sharply defined preferences for national
brands than consumers with low frequency or low volume of
purchase.” Dillon and Gupta (1996) found a positive relationship
between purchase frequency and brand-choice price sensitivity
in the paper towel category. In other purchase contexts,
the relationship between purchase frequency and price sensitivity
may be negative. For example, a segmentation study of customers
of Mobil Oil (Forsyth, Gupta, Haldar, Kaul, and Kettle
1999) found that price shoppers spent an average of $700 annually,
whereas price-insensitive, heavier users spent as much
as $1,200. Similarly, Chib, Seetharaman, and Strijnev (2004)
found that the parameters explaining category purchases are
correlated with the parameters explaining brand-choice decisions.
To illustrate the purchase frequency bias, we consider the
following simple brand-choice structure. On each purchase
occasion, consumers choose between two brands, A and B.
Consumers are heterogeneous in their preferences for the two
brands and belong to one of two homogeneous segments.
Each consumer in segment 1 has a .30 probability of choosing
brand A on each purchase occasion, whereas each consumer
in segment 2 has a .70 probability of choosing brand A. For
simplicity, we omit marketing variables from this example.
The two segments have an equal number of consumers. However,
consumers in segment 1, who prefer brand B, tend to be
light buyers and make four purchases in the category during
the period studied in the panel data, whereas consumers in
segment 2, who prefer brand A, are heavy buyers and make eight
purchases.
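The structure just described is easy to simulate. The following is a minimal sketch, not the authors' code: it generates a panel with the stated choice probabilities (.30 and .70) and purchase counts (four and eight), then fits a conventional two-segment latent-class model whose segment sizes do not depend on purchase frequency. The number of households per segment, the random seed, the starting values, the use of EM as the estimation routine, and all function names are assumptions made here for illustration. Note that segment 2, although it contains half of the households, contributes 8/(4 + 8), or two-thirds, of the purchase occasions.

```python
import random


def simulate_panel(households_per_segment=500, seed=0):
    """Return one (n_purchases, n_brand_A) pair per household.

    Segment 1: P(brand A) = .30, four purchases per household.
    Segment 2: P(brand A) = .70, eight purchases per household.
    The household count and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    panel = []
    for prob_a, n_purchases in ((0.30, 4), (0.70, 8)):
        for _ in range(households_per_segment):
            n_a = sum(rng.random() < prob_a for _ in range(n_purchases))
            panel.append((n_purchases, n_a))
    return panel


def clamp(p, eps=1e-9):
    """Keep a probability strictly inside (0, 1) for numerical safety."""
    return min(max(p, eps), 1.0 - eps)


def fit_latent_class(panel, n_iter=200):
    """EM for a two-segment binomial mixture (conventional latent class).

    The segment size is a single number shared by all households, so the
    model ignores any dependence between segment and purchase frequency.
    Returns (size of segment 1, P(A | segment 1), P(A | segment 2)).
    """
    size1, p1, p2 = 0.5, 0.4, 0.6  # arbitrary starting values
    for _ in range(n_iter):
        # E-step: posterior probability that each household is in segment 1.
        # Binomial coefficients cancel in the ratio and are omitted.
        resp = []
        for n, k in panel:
            like1 = size1 * p1 ** k * (1 - p1) ** (n - k)
            like2 = (1 - size1) * p2 ** k * (1 - p2) ** (n - k)
            resp.append(like1 / (like1 + like2))
        # M-step: update the segment size and the choice probabilities.
        size1 = clamp(sum(resp) / len(panel))
        p1 = clamp(sum(r * k for r, (_, k) in zip(resp, panel))
                   / sum(r * n for r, (n, _) in zip(resp, panel)))
        p2 = clamp(sum((1 - r) * k for r, (_, k) in zip(resp, panel))
                   / sum((1 - r) * n for r, (n, _) in zip(resp, panel)))
    return size1, p1, p2


if __name__ == "__main__":
    panel = simulate_panel()
    size1, p1, p2 = fit_latent_class(panel)
    print(f"estimated size of segment 1: {size1:.3f}")
    print(f"estimated P(A | segment 1):  {p1:.3f}")
    print(f"estimated P(A | segment 2):  {p2:.3f}")
```

Comparing the printed estimates with the true values (segment size .50, choice probabilities .30 and .70) gives a direct check of whether the conventional model recovers the structure.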
The question of interest is as follows: If a latent-class model
of brand choice (e.g., Kamakura and Russell 1989) is estimated
on these data, would we recover the true brand preferences of
each segment and the relative sizes of the segments? Surprisingly,
the answer turns out to be “no.” Based on simulated data
© 2005 American Statistical Association
Journal of Business & Economic Statistics
October 2005, Vol. 23, No. 4
DOI 10.1198/073500104000000569