RESEARCH ARTICLE

Berislav Lisnić · Ivan-Krešimir Svetec · Hrvoje Šarić · Ivan Nikolić · Zoran Zgaga

Palindrome content of the yeast Saccharomyces cerevisiae genome

Received: 10 February 2005 / Accepted: 20 February 2005 / Published online: 18 March 2005
© Springer-Verlag 2005

Abstract Palindromic sequences are important DNA motifs involved in the regulation of different cellular processes, but are also a potential source of genetic instability. In order to initiate a systematic study of palindromes at the whole genome level, we developed a computer program that can identify, locate and count palindromes in a given sequence in a strictly defined way. All palindromes, defined as identical inverted repeats without spacer DNA, can be analyzed and sorted according to their size, frequency, GC content or alphabetically. This program was then used to prepare a catalog of all palindromes present in the chromosomal DNA of the yeast Saccharomyces cerevisiae. For each palindrome size, the observed palindrome counts were significantly different from those in the randomly generated equivalents of the yeast genome. However, while the short palindromes (2–12 bp) were under-represented, the palindromes longer than 12 bp were over-represented, AT-rich and preferentially located in the intergenic regions. The 44-bp palindrome found between the genes CDC53 and LYS21 on chromosome IV was the longest palindrome identified and contained only two C-G base pairs. Avoidance of coding regions was also observed for palindromes of 4–12 bp, but was less pronounced. Dinucleotide analysis indicated a strong bias against palindromic dinucleotides that could explain the observed short palindrome avoidance. We discuss some possible mechanisms that may influence the evolutionary dynamics of palindromic sequences in the yeast genome.
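The definition used in the abstract (identical inverted repeats without spacer DNA, that is, a sequence equal to its own reverse complement) can be illustrated with a minimal Python sketch. This is not the authors' program: the function names `find_palindromes` and `gc_content` are hypothetical, and the choice to report only the longest palindrome around each center is an assumption, since the exact counting rule is not given in this section.

```python
from collections import Counter

# Watson-Crick complements; any other symbol (e.g. N) never pairs.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def find_palindromes(seq, min_len=2):
    """Return (start, palindrome) for the maximal palindrome at each center.

    A DNA palindrome equals its reverse complement, so it always has even
    length and its center lies between two adjacent bases.
    Assumption: only the longest palindrome per center is reported.
    """
    seq = seq.upper()
    hits = []
    for center in range(1, len(seq)):
        left, right = center - 1, center
        # Extend outward while the flanking bases are complementary.
        while (left >= 0 and right < len(seq)
               and COMPLEMENT.get(seq[left]) == seq[right]):
            left -= 1
            right += 1
        length = right - left - 1
        if length >= min_len:
            hits.append((left + 1, seq[left + 1:right]))
    return hits


def gc_content(palindrome):
    """Fraction of G and C bases in a non-empty sequence."""
    return (palindrome.count("G") + palindrome.count("C")) / len(palindrome)


if __name__ == "__main__":
    hits = find_palindromes("CCACGTCCGAATTC")
    sizes = Counter(len(p) for _, p in hits)  # palindrome count per size
    for start, pal in hits:
        print(start, pal, round(gc_content(pal), 2))
    print(dict(sizes))
```

Sorting the resulting list by length, by `gc_content`, or alphabetically would reproduce the kinds of ordering mentioned in the abstract; comparing the size counts with those from shuffled sequences of the same composition is one possible way to look for over- or under-representation.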
Keywords Palindrome · Inverted repeat · Dinucleotide · Saccharomyces cerevisiae · Sequence analysis

Communicated by S. Hohmann

B. Lisnić · I.-K. Svetec · Z. Zgaga (✉)
Faculty of Food Technology and Biotechnology, University of Zagreb, Pierottijeva 6, 10000 Zagreb, Croatia
E-mail: zgazo@pbf.hr
Tel.: +385-1-4836013
Fax: +385-1-4836016

H. Šarić · I. Nikolić
Sail Company Croatia Ltd., Ilica 412, 10000 Zagreb, Croatia

Curr Genet (2005) 47: 289–297
DOI 10.1007/s00294-005-0573-5

Introduction

Closely spaced inverted repeats (IRs), palindromes and quasipalindromes can be found in the DNA of natural plasmids, viral and bacterial genomes and eukaryotic chromosomes and organelles. In prokaryotes, they may serve as binding sites for regulatory proteins, while short perfect palindromes are known as recognition sites for type II restriction-modification systems (RMSs) that play a significant role in bacterial ecology and evolution (Gelfand and Koonin 1997; Rocha et al. 2001). Another important property of such motifs is their potential to form intra-strand hydrogen bonds within DNA molecules or in the corresponding RNA transcripts. Therefore, they are contained in genes encoding functional RNA molecules, the structure of which depends on the formation of proper intra-strand bonding, and in different cis-acting genetic elements, such as terminators, attenuators, and plasmid and viral origins of replication. Protein binding and secondary structure formation are also modes of action for IRs and related motifs in eukaryotic cells. For example, palindromes with a spacer of one nucleotide were identified in yeast sequences regulating the cellular response to the accumulation of unfolded proteins in the endoplasmic reticulum (Mori et al. 1998), and a heterodimeric complex was isolated that binds two palindromic sequences in the promoter region of the human erbB-2 gene (Chen and Gill 1996). In mouse B lymphoma cells, palindromic and potential stem-loop motifs were identified as break-points during class switch recombination (Tashiro et al. 2001), and the formation of intra-strand secondary structures is essential in the process of immunoglobulin gene rearrangement known as V(D)J-joining (Cuomo et al. 1996).

However, in spite of their importance and functional versatility, longer palindromes and IRs were shown to be