Purchase-Frequency Bias in Random-Coefficients Brand-Choice Models

Anand V. BODAPATI
Anderson Graduate School of Management, University of California, Los Angeles, CA 90095 (bodapati@ucla.edu)

Sachin GUPTA
Johnson Graduate School of Management, Cornell University, Ithaca, NY 14853 (sg248@cornell.edu)

Conventional random-coefficients models of conditional brand choice using panel data ignore the dependence of the random-coefficients distribution on the purchase frequencies. We show that this leads to biased estimates and propose a conditional likelihood approach to obtain unbiased estimates. Unlike alternative approaches that require observation of “no-purchase” occasions, our proposed method relies only on purchase data. Furthermore, our approach does not require that the researcher specify the distribution of purchase frequencies. As a result, estimates of the brand-choice model are unaffected by misspecification of the model of purchase frequencies. We demonstrate the performance of the proposed approach in simulated data and in scanner data. We find that results differ substantively from the conventional latent-class model in terms of segment membership probabilities, segment characteristics, and price elasticities.

KEY WORDS: Choice model; Estimation method; Latent class model; Multinomial logit; Panel data; Random coefficient; Unobserved heterogeneity.

1. INTRODUCTION

A large body of marketing literature is based on statistical modeling of consumer brand-choice behavior using household panel data. In panel data we typically observe varying numbers of category purchases made by households during any observed time window. These differences in numbers of purchases reflect differences in category consumption (e.g., heavy buyers vs. light buyers) as well as the data-capture mechanism.
For instance, panel data gathered via the frequent-shopper card of a particular retailer are likely to show more occurrences of purchases of consumers who are loyal to that retailer’s stores. In this article we examine the implications of these differences in observed purchase frequencies of panelists on the modeling of brand-choice behavior.

Typically, researchers who only are interested in brand-choice behavior develop brand-choice models that account for heterogeneity in choice parameters between panelists, but ignore the process that generates differences in purchase frequencies. A partial list of such models includes those of Kamakura and Russell (1989), Chintagunta, Jain, and Vilcassim (1991), Gonul and Srinivasan (1993), and Fader and Hardie (1996). We show in this article that if the purchase frequency of a household is not independent of the parameters describing its choice probabilities, then current estimation approaches lead to biased estimates of the parameters of the choice model. We term this bias “purchase frequency bias.” In numerical simulations we show that the magnitude of this bias may be quite severe. Our result applies to both the primary models of unobserved heterogeneity prevalent in the literature—the Gaussian continuous random-effects model, and the finite-mixture or latent-class model.

Should we expect the purchase frequencies and brand-choice parameters to be dependent across panelists? Findings in the literature suggest this is indeed the case. For the canned tuna fish category, Kim and Rossi (1994) reported the following: “Our most striking finding is that consumers with high purchase frequency or high purchase volume are much more price sensitive and have more sharply defined preferences for national brands than consumers with low frequency or low volume of purchase.” Dillon and Gupta (1996) found a positive relationship between purchase frequency and brand-choice price sensitivity in the paper towel category.
In other purchase contexts, the relationship between purchase frequency and price sensitivity may be negative. For example, a segmentation study of customers of Mobil Oil (Forsyth, Gupta, Haldar, Kaul, and Kettle 1999) found that price shoppers spent an average of $700 annually, whereas price-insensitive, heavier users spent as much as $1,200. Similarly, Chib, Seetharaman, and Strijnev (2004) found that the parameters explaining category purchases are correlated with the parameters explaining brand-choice decisions.

To illustrate the purchase frequency bias, we consider the following simple brand-choice structure. On each purchase occasion, consumers choose between two brands, A and B. Consumers are heterogeneous in their preferences for the two brands and belong to one of two homogeneous segments. Each consumer in segment 1 has a .30 probability of choosing brand A on each purchase occasion, whereas each consumer in segment 2 has a .70 probability of choosing brand A. For simplification, we omit marketing variables from this example. The two segments have an equal number of consumers. However, consumers in segment 1, who prefer brand B, tend to be light buyers and make four purchases in the category during the period studied in the panel data, whereas consumers in segment 2, who prefer brand A, are heavy buyers and make eight purchases.

The question of interest is as follows: If a latent-class model of brand choice (e.g., Kamakura and Russell 1989) is estimated on these data, would we recover the true brand preferences of each segment and the relative sizes of the segments? Surprisingly, the answer turns out to be “no.” Based on simulated data

© 2005 American Statistical Association, Journal of Business & Economic Statistics, October 2005, Vol. 23, No. 4, DOI 10.1198/073500104000000569