Why are grammatical elements more evenly dispersed than lexical elements? Assessing the roles of text frequency and semantic generality

Martin Hilpert¹ and David Correia Saavedra¹

¹ Université de Neuchâtel, Institut de langue et littérature anglaises, Espace Louis-Agassiz 1, CH-2000 Neuchâtel, Switzerland. Correspondence to: Martin Hilpert, e-mail: martin.hilpert@unine.ch

Published in Corpora 12, issue 3, 369-392, 2017, which should be used for any reference to this work

Abstract

Grammatical elements such as determiners, conjunctions or pronouns are very evenly dispersed across natural language data. By contrast, the uses of lexical elements have a stronger tendency to occur in bursts that are interspersed by long lulls. This paper considers two alternative explanations for this difference. First, it could be hypothesised that the more even distribution of grammatical elements is merely an effect of an element’s high text frequency. Alternatively, it could be argued that a more even distribution is a symptom of greater generality in meaning. In order to assess the impact of both frequency and semantic generality, we conducted a corpus-based study that contrasts lexical and grammatical elements in Present-Day English. Our results suggest that evenness of dispersion is chiefly an effect of high frequency.

Keywords: abstractness, deviation of proportions, dispersion, distributional semantics, grammaticalisation.

1. Introduction

Highly grammaticalised elements, such as determiners (the, an), conjunctions (and, because) or pronouns (she, yours), are not only very frequent in running text, but they also tend to be very evenly dispersed. A randomly chosen sentence from a book written in English is very likely to contain the determiner the, and crucially, so are the following sentences. By contrast, lexical items, or content words, do not attain the same level of text frequency, and they usually show a distribution that is characterised by bursts