REPORT

Ancient Substructure in Early mtDNA Lineages of Southern Africa

Chiara Barbieri, 1,7,* Mário Vicente, 3,4 Jorge Rocha, 4,5 Sununguko W. Mpoloka, 6 Mark Stoneking, 2 and Brigitte Pakendorf 1,8

Among the deepest-rooting clades in the human mitochondrial DNA (mtDNA) phylogeny are the haplogroups defined as L0d and L0k, which are found primarily in southern Africa. These lineages are typically present at high frequency in the so-called Khoisan populations of hunter-gatherers and herders who speak non-Bantu languages, and the early divergence of these lineages led to the hypothesis of ancient genetic substructure in Africa. Here we update the phylogeny of the basal haplogroups L0d and L0k with 500 full mtDNA genome sequences from 45 southern African Khoisan and Bantu-speaking populations. We find previously unreported subhaplogroups and greatly extend the amount of variation and time-depth of most of the known subhaplogroups. Our major finding is the definition of two ancient sublineages of L0k (L0k1b and L0k2) that are present almost exclusively in Bantu-speaking populations from Zambia; the presence of such relic haplogroups in Bantu speakers is most probably due to contact with ancestral pre-Bantu populations that harbored different lineages than those found in extant Khoisan. We suggest that although these populations went extinct after the immigration of the Bantu-speaking populations, some traces of their haplogroup composition survived through incorporation into the gene pool of the immigrants. Our findings thus provide evidence for deep genetic substructure in southern Africa prior to the Bantu expansion that is not represented in extant Khoisan populations.

Sub-Saharan Africa harbors the deepest-rooting lineages of human mitochondrial DNA (mtDNA), in agreement with an African origin of modern humans supported by both fossil and genetic evidence.
1–4 Several studies concurred in placing the root of the mtDNA phylogeny in the southern half of the continent, 5–7 and two deep-rooting clades of this phylogeny, haplogroups L0d and L0k, have been unanimously associated with so-called Khoisan populations. 6–9 The generic term "Khoisan" covers hunter-gatherer and pastoralist populations of southern Africa who speak non-Bantu indigenous languages and share some linguistic features (one of the most characteristic being the heavy use of click consonants in their languages); however, these similarities might be the effect of contact. 10 Haplogroups L0d and L0k are present nearly exclusively in Khoisan populations and neighboring Bantu-speaking populations that have been in documented close contact with them; 11–14 the only known exceptions are sporadic occurrences of haplogroup L0d in East Africa (e.g., in the Sandawe from Tanzania) 7 and in an individual from Yemen 6 as well as an individual from Kuwait 6 who belongs to haplogroup L0k.

Specialists recognize three independent language families among Khoisan, namely Tuu, Kx’a, and Khoe-Kwadi, 15–17 which are spoken by a large number of different ethnolinguistic groups comprising both foragers and pastoralists. The forager populations of the central Kalahari, who speak languages belonging to the Tuu and Kx’a families, are assumed to be the descendants of autochthonous Late Stone Age populations, whereas the Khoe-Kwadi languages may have been brought to the area by pastoralist populations around 2,000 years ago. 18–20 The populations speaking Bantu languages, in contrast, are known for their expansion over almost half the African continent and are associated with the concomitant spread of the Bantu language family, an agricultural lifestyle, and iron technology.
3,21,22 Archeological data suggest that they may have reached southern Africa not earlier than 2,000–1,200 years ago, 3,23,24 where they met populations who were probably ancestral to current Khoisan populations.

The most recent comprehensive study that focused on the deepest-rooting lineages of the mtDNA phylogeny was undertaken by Behar et al., 6 who analyzed a total of 624 full mtDNA sequences belonging to haplogroup L*(xM,N). Although this was the first substantial collection of complete mtDNA genome sequences from Africa, some limitations arose from the inclusion of a large number of sequences from diverse published sources that were not always of high quality; furthermore, for some sequences the source population or the country of origin was not clearly specified. Nevertheless, the sequences considered in that study still represent the vast majority of the haplogroup L*(xM,N) data set included in the most recent version of Phylotree (Build 15, September 2012 25), a comprehensive database of mtDNA genome sequences that is periodically updated when more data become available.
1 Max Planck Research Group on Comparative Population Linguistics, 2 Department of Evolutionary Genetics, MPI for Evolutionary Anthropology, Leipzig 04103, Germany; 3 STAB VIDA, Investigação e Serviços em Ciências Biológicas, Lda, Oeiras 2780-182, Portugal; 4 CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos da Universidade do Porto, Vairão 4485-661, Portugal; 5 Departamento de Biologia, Faculdade de Ciências da Universidade do Porto, Porto 4169-007, Portugal; 6 Department of Biological Sciences, University of Botswana, Gaborone UB 0022, Botswana
7 Present address: Department of Evolutionary Genetics, MPI for Evolutionary Anthropology, Leipzig 04103, Germany
8 Present address: Laboratoire Dynamique du Langage, UMR5596, CNRS and Université Lyon Lumière 2, Lyon 69007, France
*Correspondence: chiara_barbieri@eva.mpg.de
http://dx.doi.org/10.1016/j.ajhg.2012.12.010. ©2013 by The American Society of Human Genetics. All rights reserved.
The American Journal of Human Genetics 92, 1–8, February 7, 2013
Please cite this article in press as: Barbieri et al., Ancient Substructure in Early mtDNA Lineages of Southern Africa, The American Journal of Human Genetics (2013), http://dx.doi.org/10.1016/j.ajhg.2012.12.010