Mapping Phonetic Variation in the Newcastle Electronic Corpus of Tyneside English

Hermann Moisl

September 21, 2011

The Newcastle Electronic Corpus of Tyneside English (Necte) is a sample of dialect speech from Tyneside in North-East England (Corrigan et al. 2006; Allen et al. 2007). Jones-Sargent (1983), Moisl and Jones (2005), and Moisl, Maguire and Allen (2006) used cluster analysis to show that the speakers who constitute the earlier of the two chronological strata in the corpus fall into distinct groups defined by relative frequency of usage of phonetic segments, and Moisl and Maguire (2008) went on to identify the main phonetic determinants of that grouping by comparing cluster centroids. The present discussion develops these findings by constructing a map which comprehensively describes the pattern of phonetic variation across the Necte speakers, and, in combination with the earlier studies just cited, is intended as a contribution to a methodology for corpus-based mathematical and statistical study of language variation. The discussion is in two main parts: the first part briefly describes Necte, the second constructs the phonetic variation map.

1 The Newcastle Electronic Corpus of Tyneside English

Necte is a corpus of dialect speech from Tyneside in North-East England, shown as the boxed area in Figure 1. It is based on two pre-existing corpora of audio-recorded speech, one of them gathered in the late 1960s by the Tyneside Linguistic Survey (Tls) (Strang 1968; Pellowe et al. 1972), and the other between 1991 and 1994 by the Phonological Variation and Change in Contemporary Spoken English (PVC) project (Milroy et al. 1994). This discussion, like the earlier ones cited in the Introduction, deals with the Tls component of Necte only, which is henceforth referred to as Necte/Tls.
Necte/Tls includes phonetic transcriptions of each of 64 recordings which the Tls team produced with the aim of determining whether systematic phonetic variation among Tyneside speakers of the period could be significantly correlated with variation in their social characteristics. To this end the Tls developed a methodology which was radical at the time and remains so today: in contrast to the then-universal and still-dominant theory-driven approach, where social and linguistic factors are selected by the analyst on the basis of some combination of an independently-specified theoretical framework, existing