Review

Identification of RNA polymerase III-transcribed genes in eukaryotic genomes

Giorgio Dieci a,*, Anastasia Conti a, Aldo Pagano b,c, Davide Carnevali a

a Dipartimento di Bioscienze, Università degli Studi di Parma, Parco Area delle Scienze 23/A, 43124 Parma, Italy
b Dipartimento di Medicina Sperimentale, Università degli Studi di Genova, Italy
c IRCCS AOU San Martino, IST, Genova, Italy

Article info

Article history:
Received 30 August 2012
Received in revised form 20 September 2012
Accepted 21 September 2012
Available online 2 October 2012

Keywords: RNA polymerase III; ncRNA; tRNA; SINE; TFIIIC; ChIP-seq

Abstract

The RNA polymerase (Pol) III transcription system is devoted to the production of short, generally abundant non-coding (nc) RNAs in all eukaryotic cells. Previously thought to be restricted to a few housekeeping genes easily detectable in genome sequences, the set of known Pol III-transcribed genes (class III genes) has been expanding over the last ten years. Interest in their detection, annotation and actual expression has been revived by recent high-resolution genome-wide location analyses of the mammalian Pol III machinery, together with Pol III-centered computational studies and ncRNA-focused transcriptomic approaches. In this article, we outline the distinctive features of Pol III-transcribed genes that have allowed, and still allow, their detection in genome sequences; we critically review the strategies currently used to identify novel class III genes and transcripts; and we discuss emerging themes in Pol III transcription regulation that might orient future transcriptomic studies. This article is part of a Special Issue entitled: Transcription by Odd Pols.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

The RNA polymerase (Pol) III transcription machinery is devoted to the production of small non-protein-coding (nc) RNAs, whose transcription units are frequently present in multiple copies in eukaryotic genomes. About 400 transcription units are targeted by the Pol III machinery in the Saccharomyces cerevisiae genome, a number that approaches 1000 in the human and mouse genomes [1]. The most abundant products of Pol III-dependent transcription are the different species of tRNAs, which differ from one another in the amino acid with which they are charged and in the corresponding anticodon, and the 5S rRNA, generally encoded by one or a few hundred identical transcription units. These RNAs, together with the Pol I-synthesized 5.8S, 28S and 18S rRNAs, are fundamental components of the protein synthesis machinery.

In addition to the abundant ncRNAs involved in translation, whose synthesis accounts for most of the Pol III workload in eukaryotic cells, Pol III has long been known to synthesize a small, heterogeneous set of ncRNAs that are generally abundant and are involved in different cellular processes, from protein translocation to rRNA and tRNA processing. The known set of non-tRNA/non-rRNA genes transcribed by Pol III has expanded remarkably during the last decade, thanks especially to the results of transcriptome analyses, genome-wide location studies of transcription factors and computational searches for Pol III regulatory elements in eukaryotic genomes.
The recent widening of the known Pol III transcriptome has been reviewed previously [2], but a wealth of significant studies published in the last three years, in particular studies applying chromatin immunoprecipitation followed by sequencing (ChIP-seq) to the Pol III machinery in mammalian cells, has expanded it further. These studies have not only identified novel class III genes, but have also shed light on unexpected features of Pol III-targeted loci that complicate the view of their transcriptional regulation. Recent reviews have addressed in detail the novel Pol III-related issues revealed by ChIP-seq studies in mammals, including the cell type-specific variation of Pol III occupancy of tRNA genes despite their sharing the same core promoters, the overlap between Pol III and Pol II occupancy at class III gene loci, and the widespread occurrence of loci associated with TFIIIC only, which do not correspond to Pol III transcription units [1,3]. In this review, we outline the computational and empirical strategies that, historically and up to the most recent studies, have made it possible to establish inventories of Pol III-transcribed genes and their transcripts in eukaryotic genomes, pointing out the advantages and limitations of the different approaches, as well as critical issues in the structural annotation of the different types of class III genes. We also provide an update on newly identified Pol III transcripts, knowledge of which further contributes to our understanding of the Pol III-dependent gene expression network.

Biochimica et Biophysica Acta 1829 (2013) 296–305
* Corresponding author. Tel.: +39 0521 905649; fax: +39 0521 905151. E-mail address: giorgio.dieci@unipr.it (G. Dieci).
1874-9399/$ – see front matter © 2012 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.bbagrm.2012.09.010

2. Computational search algorithms for Pol III-transcribed genes

2.1. tRNA genes

The most striking and easily recognizable DNA sequence feature of class III genes is the presence, in a significant subset of them (in particular, in all tRNA genes), of two internal control regions,