Quantifying semantic shift for reconstructing language families

William Croft, Clayton Beckner, Logan Sutton
University of New Mexico

Jon Wilkins, Tanmoy Bhattacharya, Daniel Hruschka
Santa Fe Institute

Abstract

In comparative historical linguistics, one must weigh evidence from large numbers of putative cognates in order to arrive at the best hypothesis of the family tree and reconstructions. The comparativist presently relies on unquantified knowledge of the processes of semantic shift. We present a typological study of word polysemy in order to construct a quantified network of semantic similarity among basic vocabulary items for comparative historical research. We investigate 22 concepts denoting natural objects in the Swadesh list across a typological sample of over 50 languages. In addition to its value for comparative historical linguistics, the study also reveals universals of lexical conceptual space.

1. Introduction

The need

‘…historical linguistics cannot ignore semantic change. For unless we can relate words such as Old English hlāf ‘bread’ and New English loaf not only phonetically but also semantically, it is impossible to trace many historical developments and to do meaningful historical linguistic research’ (Hock 1986:284).

The problem

‘there seem to be no natural constraints on the directions and results of semantic change. Given enough imagination—and daring—it is possible to claim semantic relationship for almost any two words under the sun.’ (Hock 1986:308)

‘There is…little in semantic change which bears any relationship to regularity in phonological change’ (Fox 1995:111)

The status quo

‘If the correspondences are regular, the set of words is cognate, however unlikely the semantics. That is, structural grounds—regular correspondences—are sufficient for establishing cognacy, while semantic grounds are neither necessary nor sufficient.’ (Nichols 1996:57, describing a ‘working assumption’ of the comparative method)