Application of the Burrows-Wheeler Transform for Searching for Approximate Tandem Repeats

Agnieszka Danek¹, Rafał Pokrzywa¹, Izabela Makałowska², and Andrzej Polański¹

¹ Institute of Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
agnieszka.danek@polsl.pl
² Laboratory of Bioinformatics, Faculty of Biology, Adam Mickiewicz University, Umultowska 89, 61-614 Poznań, Poland

Abstract. Tandem repeats (TRs) are contiguous copies of repeating patterns, which may be either exact or approximate. Approximate tandem repeats (ATRs) in genomic sequences are adjacent copies of a repeating pattern of nucleotides, where similarity is defined by a suitable measure. Both TRs and ATRs are used in forensic analysis, DNA mapping, testing for inherited diseases, and many evolutionary studies. Their functions and roles are not yet fully understood and remain a subject of ongoing investigation. However, growing biological databases, together with tools for finding such repeats, may lead to a better understanding of their behavior. This paper presents our method for searching for ATRs, defined on the basis of the model of substitution mutations, and its comparison with two other tools. The capabilities and limitations of the methods are analyzed, and the results obtained with each tool are investigated.

Keywords: approximate tandem repeats, Burrows-Wheeler transform, suffix array, Hamming distance.

1 Introduction

Tandem repeats (TRs) are consecutive, repeating patterns in genomic sequences. TRs are among the most important loci in genomes because of their abundance in DNA sequences and their role both in evolution and in the molecular mechanisms by which organisms function. The evolution of tandem repeat loci is governed by a mechanism called slippage mutation (see, e.g., [1]), which, owing to its high intensity, is one of the major factors of genomic dynamics.
A very important issue is the dynamics of the interaction between slippage and point mutation, which is still an area of intensive research [2], [3], [4]. As for the functional roles of TRs in cellular mechanisms, there is considerable evidence linking TRs to important molecular processes in cells. TRs play important roles in the regulation of gene expression and transcription [5]. They are also widely used as markers for DNA mapping and DNA fingerprinting [7]. It is well known that when TRs occur in an increased, abnormal number, they cause a series of inherited diseases [6] (i.e. trinucleotide repeat disorders).

T. Shibuya et al. (Eds.): PRIB 2012, LNBI 7632, pp. 255–266, 2012. © Springer-Verlag Berlin Heidelberg 2012
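To make the notion of an approximate tandem repeat under a substitution-only model more concrete, the following minimal Python sketch checks whether a string splits into adjacent copies of a fixed period whose pairwise Hamming distance stays within a bound. The function names, the `max_mismatches` parameter, and the choice to compare each copy with its immediate neighbour are assumptions made for illustration only. They are not the definition or the algorithm used in this paper, which relies on the Burrows-Wheeler transform.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def is_approximate_tandem_repeat(seq: str, period: int, max_mismatches: int) -> bool:
    """Illustrative criterion (an assumption, not the paper's definition):
    seq consists of at least two adjacent copies of length `period`, and
    every pair of neighbouring copies differs in at most `max_mismatches`
    positions, i.e. only substitutions are allowed."""
    if period <= 0 or len(seq) < 2 * period or len(seq) % period != 0:
        return False
    # Cut the sequence into consecutive, non-overlapping copies.
    copies = [seq[i:i + period] for i in range(0, len(seq), period)]
    # Compare each copy with the one that follows it.
    return all(hamming(u, v) <= max_mismatches
               for u, v in zip(copies, copies[1:]))
```

With this criterion, `ACGACGACG` is an exact tandem repeat of period 3 (zero mismatches allowed), whereas `ACGACTACG` qualifies only when at least one substitution per pair of neighbouring copies is tolerated.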