Purchase-Frequency Bias in Random-Coefficients Brand-Choice Models

Anand V. BODAPATI
Anderson Graduate School of Management, University of California, Los Angeles, CA 90095 (bodapati@ucla.edu)

Sachin GUPTA
Johnson Graduate School of Management, Cornell University, Ithaca, NY 14853 (sg248@cornell.edu)

Conventional random-coefficients models of conditional brand choice using panel data ignore the dependence of the random-coefficients distribution on the purchase frequencies. We show that this leads to biased estimates and propose a conditional likelihood approach to obtain unbiased estimates. Unlike alternative approaches that require observation of “no-purchase” occasions, our proposed method relies only on purchase data. Furthermore, our approach does not require that the researcher specify the distribution of purchase frequencies. As a result, estimates of the brand-choice model are unaffected by misspecification of the model of purchase frequencies. We demonstrate the performance of the proposed approach in simulated data and in scanner data. We find that results differ substantively from the conventional latent-class model in terms of segment membership probabilities, segment characteristics, and price elasticities.

KEY WORDS: Choice model; Estimation method; Latent class model; Multinomial logit; Panel data; Random coefficient; Unobserved heterogeneity.

1. INTRODUCTION

A large body of marketing literature is based on statistical modeling of consumer brand-choice behavior using household panel data. In panel data we typically observe varying numbers of category purchases made by households during any observed time window. These differences in numbers of purchases reflect differences in category consumption (e.g., heavy buyers vs. light buyers) as well as the data-capture mechanism.
For instance, panel data gathered via the frequent-shopper card of a particular retailer are likely to show more occurrences of purchases of consumers who are loyal to that retailer’s stores. In this article we examine the implications of these differences in observed purchase frequencies of panelists on the modeling of brand-choice behavior.

Typically, researchers who are interested only in brand-choice behavior develop brand-choice models that account for heterogeneity in choice parameters between panelists, but ignore the process that generates differences in purchase frequencies. A partial list of such models includes those of Kamakura and Russell (1989), Chintagunta, Jain, and Vilcassim (1991), Gonul and Srinivasan (1993), and Fader and Hardie (1996). We show in this article that if the purchase frequency of a household is not independent of the parameters describing its choice probabilities, then current estimation approaches lead to biased estimates of the parameters of the choice model. We term this bias “purchase frequency bias.” In numerical simulations we show that the magnitude of this bias may be quite severe. Our result applies to both of the primary models of unobserved heterogeneity prevalent in the literature: the Gaussian continuous random-effects model and the finite-mixture or latent-class model.

Should we expect the purchase frequencies and brand-choice parameters to be dependent across panelists? Findings in the literature suggest this is indeed the case. For the canned tuna fish category, Kim and Rossi (1994) reported the following: “Our most striking finding is that consumers with high purchase frequency or high purchase volume are much more price sensitive and have more sharply defined preferences for national brands than consumers with low frequency or low volume of purchase.” Dillon and Gupta (1996) found a positive relationship between purchase frequency and brand-choice price sensitivity in the paper towel category.
In other purchase contexts, the relationship between purchase frequency and price sensitivity may be negative. For example, a segmentation study of customers of Mobil Oil (Forsyth, Gupta, Haldar, Kaul, and Kettle 1999) found that price shoppers spent an average of $700 annually, whereas price-insensitive, heavier users spent as much as $1,200. Similarly, Chib, Seetharaman, and Strijnev (2004) found that the parameters explaining category purchases are correlated with the parameters explaining brand-choice decisions.

To illustrate the purchase frequency bias, we consider the following simple brand-choice structure. On each purchase occasion, consumers choose between two brands, A and B. Consumers are heterogeneous in their preferences for the two brands and belong to one of two homogeneous segments. Each consumer in segment 1 has a .30 probability of choosing brand A on each purchase occasion, whereas each consumer in segment 2 has a .70 probability of choosing brand A. For simplification, we omit marketing variables from this example. The two segments have an equal number of consumers. However, consumers in segment 1, who prefer brand B, tend to be light buyers and make four purchases in the category during the period studied in the panel data, whereas consumers in segment 2, who prefer brand A, are heavy buyers and make eight purchases.

The question of interest is as follows: If a latent-class model of brand choice (e.g., Kamakura and Russell 1989) is estimated on these data, would we recover the true brand preferences of each segment and the relative sizes of the segments? Surprisingly, the answer turns out to be “no.” Based on simulated data

© 2005 American Statistical Association, Journal of Business & Economic Statistics, October 2005, Vol. 23, No. 4, DOI 10.1198/073500104000000569
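The two-segment illustration described in the Introduction (equal-sized segments; segment 1 makes four purchases with a .30 probability of choosing brand A, segment 2 makes eight purchases with a .70 probability) can be reproduced with a short simulation. The following is a minimal sketch, not the authors' estimation code. The function names `simulate_panel` and `fit_latent_class`, the panel size of 2,000 households, the random seed, the starting values, and the use of an EM algorithm to fit the conventional two-segment latent-class model are all assumptions made for illustration. The fitted model deliberately uses a single segment-size parameter for all households, regardless of how many purchases each household makes, which is the restriction the Introduction identifies as the source of the bias.

```python
import random


def simulate_panel(n_households=2000, seed=0):
    """Simulate (n_purchases, n_brand_A_choices) per household.

    Hypothetical helper for the two-segment example: households alternate
    between segment 1 (4 purchases, P(A) = .30) and segment 2
    (8 purchases, P(A) = .70), so the segments are equal in size.
    """
    rng = random.Random(seed)
    panel = []
    for h in range(n_households):
        if h % 2 == 0:
            n, p = 4, 0.30   # segment 1: light buyers who prefer brand B
        else:
            n, p = 8, 0.70   # segment 2: heavy buyers who prefer brand A
        k = sum(rng.random() < p for _ in range(n))
        panel.append((n, k))
    return panel


def fit_latent_class(panel, n_iter=200):
    """EM for a two-segment latent-class model that ignores purchase frequency.

    Household likelihood (up to a binomial coefficient that cancels):
        pi * p1^k (1-p1)^(n-k) + (1 - pi) * p2^k (1-p2)^(n-k)
    The same segment size pi applies to every household, whatever its n.
    Returns (pi, p1, p2).
    """
    pi, p1, p2 = 0.5, 0.4, 0.6  # arbitrary starting values (assumption)
    for _ in range(n_iter):
        w_sum = a1 = t1 = a2 = t2 = 0.0
        for n, k in panel:
            # E-step: posterior probability that the household is in segment 1
            l1 = pi * p1 ** k * (1 - p1) ** (n - k)
            l2 = (1 - pi) * p2 ** k * (1 - p2) ** (n - k)
            w = l1 / (l1 + l2)
            w_sum += w
            a1 += w * k
            t1 += w * n
            a2 += (1 - w) * k
            t2 += (1 - w) * n
        # M-step: update segment size and per-segment brand-A probabilities
        pi, p1, p2 = w_sum / len(panel), a1 / t1, a2 / t2
    return pi, p1, p2


if __name__ == "__main__":
    estimates = fit_latent_class(simulate_panel())
    print("estimated (pi, p1, p2):", estimates)
    print("true      (pi, p1, p2):", (0.5, 0.30, 0.70))
```

Running the script prints the fitted segment size and brand-A probabilities next to the true values (.5, .30, .70), so the two can be compared directly. No particular numerical output is claimed here, since the estimates depend on the seed and panel size chosen.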