Principles of parametric estimation in modeling language competition

Menghan Zhang a,1 and Tao Gong b,1,2

a Institute of Linguistics, Shanghai Normal University, Shanghai 200234, China; and b Department of Linguistics, University of Hong Kong, Hong Kong

Edited by Barbara H. Partee, South College, Amherst, MA, and approved April 30, 2013 (received for review February 16, 2013)

It is generally difficult to define reasonable parameters and interpret their values in mathematical models of social phenomena. Rather than directly fitting abstract parameters against empirical data, we should define some concrete parameters to denote the sociocultural factors relevant for particular phenomena, and compute the values of these parameters based upon the corresponding empirical data. Taking the example of modeling studies of language competition, we propose a language diffusion principle and two language inheritance principles to compute two critical parameters, namely the impacts and inheritance rates of competing languages, in our language competition model derived from the Lotka–Volterra competition model in evolutionary biology. These principles assign explicit sociolinguistic meanings to those parameters and calculate their values from the relevant data of population censuses and language surveys. Using four examples of language competition, we illustrate that our language competition model with thus-estimated parameter values can reliably replicate and predict the dynamics of language competition, and it is especially useful in cases lacking direct competition data.
prestige | Fourier's law of heat conduction | Hardy-Weinberg genetic inheritance principle | logistic curve | lexical diffusion dynamics

How to define informative parameters in mathematical models of real-world phenomena remains a tough problem; in particular, how to assign explicit meanings to parameters and interpret their values in models of social phenomena critically affects the explanatory power of these models. This issue becomes more serious in recent modeling studies of language dynamics (1–5), especially competition (the process whereby local tongues are being replaced by hegemonic languages due to population migration and sociocultural exchange) (6).

Among the numerous modeling approximations of two-language competition (7–15), the most influential one was the Abrams and Strogatz (AS) model (8). It defined prestige (the socioeconomic status of the speakers of a language) of competing languages to determine the dynamics of language competition, and reported well-fitting curves to some historical data under a fixed range of prestige value. However, this abstract parameter lacked explicit sociocultural meanings; it remained unclear what were the characteristics of a language having a prestige value, say 1.2, and what was the sociocultural condition corresponding to the difference between two languages having prestige values, say 1.2 and 1.3, respectively. Lacking such empirical foundations, the prestige value had to be obtained via curve fitting, thus making this model useless in cases lacking sufficient empirical data.
Although many recent models (9–15) extended the AS model in certain aspects [e.g., the Mira and Paredes (MP) model (9) incorporated bilinguals into competition, the Stauffer and Schulze (SS) model (10) adopted network structures to confine language contact, and the Minett and Wang (MW) model (11) revealed the possibility of preserving endangered languages by enhancing their relatively small prestige values], most of them kept using prestige in their discussions of language competition and pertinent issues.

Language competition is subject to many sociocultural constraints, among which the primary ones include the population sizes of competing languages, the geographical distances between these populations, and the nonuniform population distributions in competing regions (5, 13, 16–20). Prestige alone fails to explicitly address these many factors, and applying fixed prestige values in different cases of language competition apparently disregards the actual conditions of those cases.

Noting these, we define two concrete parameters, namely the impacts and inheritance rates of competing languages, and adopt the Lotka–Volterra competition model (21–23) in evolutionary biology to study the dynamics of language competition. Meanwhile, we propose a language diffusion principle and two language inheritance principles to calculate these parameters based on the relevant data of population censuses and language surveys.

The language diffusion principle, inspired by Fourier's law of heat conduction, computes the impacts of competing languages from the population sizes of these languages and the geographical distances between the region where competition occurs and the population centers of these languages. The empirical data for this calculation are available in population censuses and geographical information systems.
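For orientation, the classical two-species Lotka–Volterra competition model referred to above can be sketched numerically as follows. This is the textbook form of that model, not the language competition model derived from it in this paper. The function name, the parameter names (`r1`, `k1`, `a12`, and so on), the numeric values, and the forward Euler integration are all assumptions chosen for illustration.

```python
def lotka_volterra_competition(x1, x2, r1, r2, k1, k2, a12, a21,
                               dt=0.01, steps=10000):
    """Integrate the textbook two-species Lotka-Volterra competition model.

    dx1/dt = r1 * x1 * (1 - (x1 + a12 * x2) / k1)
    dx2/dt = r2 * x2 * (1 - (x2 + a21 * x1) / k2)

    x1, x2   : initial population sizes (hypothetical values)
    r1, r2   : intrinsic growth rates
    k1, k2   : carrying capacities
    a12, a21 : competition coefficients (effect of the other species)
    Returns the population sizes after `steps` forward Euler steps of size `dt`.
    """
    for _ in range(steps):
        # Evaluate both derivatives from the current state before updating.
        dx1 = r1 * x1 * (1.0 - (x1 + a12 * x2) / k1)
        dx2 = r2 * x2 * (1.0 - (x2 + a21 * x1) / k2)
        x1 += dt * dx1
        x2 += dt * dx2
    return x1, x2


# Illustrative run: population 1 is only weakly affected by population 2
# (a12 = 0.5), whereas population 2 is strongly affected by population 1
# (a21 = 2.0), so population 2 is driven toward extinction.
final_x1, final_x2 = lotka_volterra_competition(
    x1=10.0, x2=90.0, r1=1.0, r2=1.0, k1=100.0, k2=100.0, a12=0.5, a21=2.0
)
```

In this sketch the competition coefficients are simply supplied by hand; they play the role that any fitted or estimated interaction parameter would play in a competition model of this family.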
Language inheritance principle I, inspired by the Hardy–Weinberg genetic inheritance principle (24, 25), computes the inheritance rates of competing languages based on the occurring frequencies of these languages during language learning. Both monolinguals and bilinguals are taken into account, and the empirical data for this calculation can be extracted from the surveys of speakers' language choices in communities.

Language inheritance principle II, inspired by the well-attested lexical diffusion dynamics (26, 27), adopts the logistic curve (28) to estimate the inheritance rates of competing languages. This makes the principle applicable in cases lacking sufficient data on speakers' language choices.

Following these principles, the calculated parameter values can clearly indicate the influence of those primary factors on language competition. In practice, rather than curve fitting, we first explicitly compute the values of these parameters and then use our language competition model with the thus-estimated parameter values to replicate the dynamics in particular cases of language competition. Based on language inheritance principle II, our model can also reasonably predict the dynamics of language competition in cases that lack direct competition data.

Materials and Methods

Language Competition Model. When multiple languages come into contact, one or more of them may become endangered because speakers may prefer using the others. Such competition can be viewed as a process where these languages gain survival advantage via resource plunder. Resource here refers to the speakers in the competing region; the survival advantage of a language manifests primarily in its number of speakers in this region, and the competition dynamics is reflected mainly by the change in the population

Author contributions: M.Z. and T.G. designed research; M.Z. performed research; M.Z. and T.G. analyzed data; and T.G. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
1 M.Z. and T.G. contributed equally to this work.
2 To whom correspondence should be addressed. E-mail: gtojty@gmail.com.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1303108110/-/DCSupplemental.
9698–9703 | PNAS | June 11, 2013 | vol. 110 | no. 24 | www.pnas.org/cgi/doi/10.1073/pnas.1303108110