Research paper

Exceptionally long 5′ UTR short tandem repeats specifically linked to primates

P. Namdar-Aligoodarzi, S. Mohammadparast 1, B. Zaker-Kandjani 1, S. Talebi Kakroodi, M. Jafari Vesiehsari, M. Ohadi

Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran

Article history: Received 1 February 2015; received in revised form 12 May 2015; accepted 13 May 2015; available online xxxx

Keywords: Short tandem repeat; 5′ untranslated region; Primate-specific; Speciation; Adaptive evolution

Abstract

We have previously reported genome-scale short tandem repeats (STRs) in the core promoter interval (i.e. −120 to +1 relative to the transcription start site) of protein-coding genes that have evolved identically in primates vs. non-primates. Those STRs may function as evolutionary switch codes for primate speciation. In the current study, we used the Ensembl database to analyze the 5′ untranslated region (5′ UTR) between +1 and +60 of the transcription start site of all human protein-coding genes annotated in the GeneCards database, in order to identify "exceptionally long" STRs (≥5-repeats), which may be of selective/adaptive advantage. This interval is critical because it functions as part of the core promoter and affects both transcription and translation. In order to minimize ascertainment bias, we analyzed the evolutionary status of the human 5′ UTR STRs of ≥5-repeats in several species encompassing six major orders and superorders across mammals, including primates, rodents, Scandentia, Laurasiatheria, Afrotheria, and Xenarthra. We introduce primate-specific STRs, and STRs which have expanded from mouse to primates. Identical co-occurrence in primates of the identified STRs, which are of rare average frequency (between 0.006 and 0.0001), supports a role for those motifs in processes that diverged primates from other mammals, such as neuronal differentiation (e.g. APOD and FGF4) and craniofacial development (e.g. FILIP1L).
A number of the identified STRs of ≥5-repeats may be human-specific (e.g. ZMYM3 and DAZAP1). Future work is warranted to examine the importance of the listed genes in primate/human evolution, development, and disease.

© 2015 Elsevier B.V. All rights reserved.

1. Introduction

Emerging comparative genomics studies on short tandem repeats (STRs) support a role for those motifs in primate speciation (Ohadi et al., 2015; Rezazadeh et al., 2014; Mohammadparast et al., 2014). STRs have recently been shown to have a prominent role in epistasis (Press et al., 2014). In view of the role of epistasis as the primary factor in molecular evolution (Breen et al., 2012), and the potential of STRs to contract or expand, it may be speculated that STRs provide more effective evolutionary codes necessary for adaptive evolution than the quaternary codes provided by DNA nucleotide blocks (G, A, T, and C) in non-repetitive DNA sequences. Indeed, selection could shape STRs into "tuning knobs" that facilitate evolutionary adaptation (King et al., 2006). This possibility is consistent with the evolutionary conservation of STRs in genes with neurological and neurodevelopmental functions (Darvish et al., 2013; Bolton et al., 2013; King, 2012; Heidari et al., 2011; Zarif Yeganeh et al., 2010). Moreover, genes driven by repeat-containing promoters show significantly higher rates of transcriptional divergence, and variations in repeat length result in changes in expression and local nucleosome positioning, whereas substitution of the repeats with non-repetitive DNA of identical length does not restore gene expression activity (Vinces et al., 2009). In a genome-scale analysis of all human protein-coding genes annotated in the GeneCards database, we have recently reported a catalog of core promoters, i.e. the interval between −120 and +1 of the transcription start site (TSS), containing "exceptionally long" STRs of ≥6-repeats (Ohadi et al., 2012a).
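The screening criterion described above, a short motif repeated in tandem at least a threshold number of times inside a fixed window next to the TSS, can be illustrated with a minimal sketch. This is not the authors' pipeline. The function name find_long_strs, the maximum motif length of 6 nt, and the inclusion of mononucleotide runs are assumptions made only for illustration. The input is assumed to be a window sequence (for example the 60 nt between +1 and +60) that has already been retrieved from a genome database.

```python
import re
from typing import List, NamedTuple


class STR(NamedTuple):
    motif: str    # repeat unit, e.g. "GA"
    repeats: int  # number of tandem copies
    start: int    # 0-based offset within the scanned window


def _is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_long_strs(window: str, min_repeats: int = 5, max_motif_len: int = 6) -> List[STR]:
    """Report perfect tandem repeats with at least `min_repeats` copies.

    `max_motif_len` is an illustrative assumption, not a value from the paper.
    """
    window = window.upper()
    hits = []
    for k in range(1, max_motif_len + 1):
        # One captured unit of length k, followed by (min_repeats - 1) or more copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(window):
            motif = m.group(1)
            # Skip units such as "AA" that are already covered by a shorter motif.
            if _is_primitive(motif):
                hits.append(STR(motif, len(m.group(0)) // k, m.start()))
    return hits


if __name__ == "__main__":
    # Hypothetical window sequence, not taken from any real gene.
    example = "CCT" + "GA" * 6 + "TTACG"
    for hit in find_long_strs(example, min_repeats=5):
        print(hit)
```

Setting min_repeats=5 corresponds to the 5′ UTR threshold used in the current study, and min_repeats=6 to the threshold of the earlier core promoter catalog. The sketch detects perfect repeats only, so interrupted or imperfect repeats would need a different approach.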
At the top of that catalog, the PAXBP1 core promoter contains the longest STR identified in a human gene core promoter. This STR is functional and has expanded exceptionally in primates (Mohammadparast et al., 2014), indicating that exceptionally long STRs may confer selective advantage and adaptation in primates. Remarkably, PAXBP1 is involved in processes that have critically diverged between non-primates and primates, such as craniofacial features (Paternoster et al., 2012) and spine morphogenesis (Guerreiro et al., 2013).

Abbreviations: STR, short tandem repeat; TF, transcription factor; TSS, transcription start site; UTR, untranslated region.
Corresponding author. E-mail address: ohadi.mina@yahoo.com (M. Ohadi).
1 Equal contribution.
Please cite this article as: Namdar-Aligoodarzi, P., et al., Exceptionally long 5′ UTR short tandem repeats specifically linked to primates, Gene (2015), http://dx.doi.org/10.1016/j.gene.2015.05.053

On the list of the exceptionally long human core promoter STRs, CYTH4 contains the longest tetra-nucleotide STR in its core promoter. This functional