Short communication

RepeatAround: A software tool for finding and visualizing repeats in circular genomes and its application to a human mtDNA database

Ana Goios a,b, José Meirinhos a, Ricardo Rocha c,d, Ricardo Lopes c,d, António Amorim a,b, Luísa Pereira a,*

a IPATIMUP (Instituto de Patologia e Imunologia Molecular da Universidade do Porto), R. Dr. Roberto Frias, s/n, 4200-465 Porto, Portugal
b Faculdade de Ciências da Universidade do Porto, Porto, Portugal
c LIACC (Laboratório de Inteligência Artificial e Ciências de Computadores), Portugal
d Departamento de Ciências de Computadores, Faculdade de Ciências da Universidade do Porto, Portugal

Received 22 March 2006; received in revised form 9 May 2006; accepted 7 June 2006
Available online 14 June 2006

Abstract

RepeatAround is a Windows-based software tool designed to find "direct repeats", "inverted repeats", "mirror repeats" and "complementary repeats", from 3 to 64 bp in length, in circular genomes. It processes input files extracted directly from the GenBank database and provides a visualisation of the location of the repeats in the genomic structure. For instance, in most mtDNAs the user can check whether the repeats are located in a coding or a non-coding region (and, in the first case, in which gene) and how far apart the members of each repeat pair are. Besides the visual tool, it produces a spreadsheet with information on the number and location of the repeats, which facilitates graphic analyses. Several genomes can be input simultaneously for phylogenetic comparison. The software can also generate random circular genomes, for statistical comparison of observed repeat distributions with their shuffled counterparts, and search for specific motifs, allowing easy confirmation of repeats flanking a newly detected rearrangement.
As an example of the programme's applications, we analysed the distribution of direct repeats in a large human mtDNA database. The results showed that direct repeats, even the larger ones, are evenly distributed among the human mtDNA haplogroups, enabling us to state that, based only on the repetitive motifs, no haplogroup is particularly more or less prone to mtDNA macrodeletions.

© 2006 Elsevier B.V. and Mitochondria Research Society. All rights reserved.

Keywords: Direct repeats; Inverted repeats; Mirror repeats; Complementary repeats; Circular genomes

* Corresponding author. Tel.: +351 225570700; fax: +351 225570799. E-mail address: lpereira@ipatimup.pt (L. Pereira).

1567-7249/$ - see front matter. doi:10.1016/j.mito.2006.06.001

1. Introduction

Genomes are interspersed with repeated sequence motifs, which can be classified into four types: direct (e.g. AGTTC/AGTTC), inverted (AGTTC/GAACT), mirror (AGTTC/CTTGA) and complementary (AGTTC/TCAAG). These repeated motifs are potential sites for gross genome rearrangements, such as deletions and duplications, leading to a variety of malfunctions and diseases. This is the etiology of many human pathologies, both in mitochondrial DNA (Brandon et al., 2005; Samuels et al., 2004) and in the nuclear genome, most of them somatic, and sometimes leading to cancer (Chuzhanova et al., 2003). Recently, Samuels (2004) showed that the life span of mammalian species can be constrained by the size of the longest direct repeats present in their mitochondrial genome.

The large number of software tools available for repeat finding [e.g. REPuter (Kurtz and Schleiermacher, 1999); GCG-package (http://www.accelrys.com); DnkSet_Demo (http://www.dnkset.com)] also testifies to the importance of these repeats for genomes. Nevertheless, some characteristics of these packages limit their application to repeat finding in mtDNA genomes: (1) many do not work in
www.elsevier.com/locate/mito Mitochondrion 6 (2006) 218–224