T. O. Oladele et al.
African Scientist Vol. 10, No. 1 March 31, 2009 1595-6881/2009 $12.00 + 0.00
Printed in Nigeria © 2009 Klobex Academic Publishers
http://www.klobex.org/afs
AFS 2008077/10102
On efficiency of sequence alignment algorithms
T. O. Oladele¹, O. M. Bamigbola² and C. O. Bewaji³
¹Department of Computer Science, University of Ilorin, Ilorin, Nigeria.
²Department of Mathematics, University of Ilorin, Ilorin, Nigeria.
³Department of Biochemistry, University of Ilorin, Ilorin, Nigeria.
(Received December 31, 2008)
ABSTRACT: A sequence alignment is a mutual arrangement of two or more sequences that exhibits where they
are similar and where they differ.
Sequence alignment is usually used to study the evolution of the sequences from a common ancestor, especially
biological sequences such as protein sequences or deoxyribonucleic acid (DNA) sequences. Sequence alignment can
also be used to study the evolution of languages and the similarity between texts.
In this paper, we discuss sequence alignment algorithms for finding similarities between sequences and for
identifying homologues (relatives) of a gene or gene product in genomic databases. This information is useful for answering a variety of
biologically related questions.
Key Words: Sequence algorithms; Bioinformatics; Protein sequences; DNA sequences
1. INTRODUCTION
Sequence alignment has become the central tool for sequence comparison in molecular biology. Its use
typically rests on an assumed mechanism of molecular evolution. DNA carries genetic material from
generation to generation by virtue of its semi-conservative replication mechanism. Changes in the material
are introduced by occasional errors and mutations during replication, and by viruses and other mechanisms
that sometimes move subsequences within the chromosome and between individuals.
In bioinformatics, a sequence alignment is a way of arranging the primary sequences of DNA, RNA or
protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary
relationships between the sequences (Lesk, 2002). Aligned sequences of nucleotide or amino acid residues
are represented as rows within a matrix. Gaps are inserted between the residues so that residues with
identical or similar characters are aligned in successive columns.
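The row-and-column representation described above can be illustrated with a minimal sketch. The two gapped rows, the gap symbol "-" and the function names are assumptions chosen for illustration and do not come from this paper; the sketch only labels each column of an already aligned pair and reports the percentage of identical columns.

```python
GAP = "-"  # assumed gap symbol; other conventions exist


def classify_columns(rows):
    """Label each column of a two-row alignment.

    '|' marks identical residues, '.' a mismatch, ' ' a column with a gap.
    """
    if len(rows) != 2 or len(rows[0]) != len(rows[1]):
        raise ValueError("expected two aligned rows of equal length")
    top, bottom = rows
    marks = []
    for a, b in zip(top, bottom):
        if GAP in (a, b):
            marks.append(" ")
        elif a == b:
            marks.append("|")
        else:
            marks.append(".")
    return "".join(marks)


def percent_identity(rows):
    """Percentage of columns in which both rows carry the same residue."""
    marks = classify_columns(rows)
    return 100.0 * marks.count("|") / len(marks)


# Hypothetical aligned DNA fragments (illustrative only).
alignment = ["GATTACA-", "GA-TGCAT"]
print(alignment[0])
print(classify_columns(alignment))
print(alignment[1])
print(percent_identity(alignment))
```

Here the denominator is the full alignment length, gap columns included; other definitions of percent identity divide by the shorter sequence length or by the number of ungapped columns.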
Very short or very similar sequences can be aligned by hand. However, most interesting problems require the
alignment of lengthy, highly variable or extremely numerous sequences that cannot be aligned solely by
human effort. Instead, human knowledge is primarily applied in constructing algorithms to produce
high-quality sequence alignments and, occasionally, in adjusting the final results to reflect patterns that are
difficult to represent algorithmically. Many computational algorithms, such as the Basic Local Alignment
Search Tool (BLAST) and FAST-All (FASTA), have been applied to the sequence alignment problem,