Biology Reference
In-Depth Information
Efficient algorithms employed in biological sequence alignment
have been around since the early 1970s. Over the decades, many
improvements and modifications have been achieved. However, the
main methodology enabling sequence alignment has stayed as the
kernel of the programs. Dynamic programming is an efficient com-
putation scheme applied on a class of problems that can be solved
recursively. Defining the alignment problem in such manner enables
the computation of optimal alignments in polynomial time and
memory for a pair of sequences.
Dynamic programming offers a feasible optimal solution for
pairwise alignments. However, an exact application becomes
impractical when multiple sequences are considered. A similar
practical issue is observed when queries are needed to be aligned
against large databases. Several heuristics are proposed for approxi-
mate solutions of the corresponding problems. Yet, the idea behind
most of the popular alignment heuristics is to approximate the
solution by dividing the problem into subpairwise alignment pro-
blems. In that sense, it is possible to say the idea of aligning two
sequences using dynamic programming is a main building block of
the multiple sequence alignment and database search algorithms.
In this chapter, we first review the pairwise sequence alignment
problem and the polynomial time solution offered by dynamic
programming. The minor variations on the algorithms to find
global, semi-global, local, and near-optimal alignments are intro-
duced. We discuss the fundamental algorithmic complexity reduc-
tions and heuristics to provide fast approximate solutions. The final
issue we consider is the relation of multiple sequence alignment and
dynamic programming.
2
The Pairwise Alignment Problem
To pose the problem and the solution program, perhaps the
introduction of pairwise sequence alignment problem is a prefera-
ble starting point because of a couple of reasons. Firstly, the
multiple sequence alignment is generally defined as a direct gener-
alization of pairwise sequence alignment. Therefore, the pairwise
solution offered can also be generalized to the multiple sequence
case. Secondly, the practical applications of multiple sequence align-
ment approximate the solution considering the pairwise alignment
of sequences. A multiple alignment is generally met by extending
the pairwise alignments to a consensus of multiple alignment.
An alignment of two biological sequences is simply an ordered
mapping. Let's say X and Y are two sequences and each element of
a sequence is called a residue. Assume the i th residue of X , X i ,is
paired with Y j . Then, a residue coming after X i cannot match up
with a residue of Y coming before Y j . Moreover, a residue can be
matched with at most one residue and the matchings are not
bijective. If a residue is not matched with another residue, it is
Search WWH ::




Custom Search