Information Technology Reference
In-Depth Information
4
The State-of-Art Methods for MSA
The most popular method to solve MSA is based on Dynamic Programming (DP)
[9], which guarantees a mathematically optimal alignment. However, the method
is limited to a small number of short sequences, since the size of the problem
space increases with the number of sequences and their length. To overcome this
problem several heuristic approaches, based on different strategies, have been
developed to effectively deal with the computational complexity of the problem.
All current methodologies of multiple alignment are heuristics and can be
classify in three main categories: progressive alignments , exact algorithms and
iterative alignments .
Progressive Alignments. In progressive alignment methods multiple alignments
are performed, first aligning the closest sequences and then the more distant
ones are added. Although this approach has the advantage of being simplistic
and very fast, it does not guarantee any level of optimization.
Therefore, the main drawback of this approach is that once a sequence
has been aligned it cannot be modified, causing possible conflicts with suc-
cessively added sequences. Alignment programs based on this approach are
MULTALIGN [10], PILEUP [11], CLUSTALX [12], CLUSTALW [13], T-
COFFEE [14]. Their strategy is to align sequences in a progressive manner us-
ing a consistency-based objective function in order to minimize possible errors.
SPEM (sequence and Secondary-structure Profiles Enhanced Multiple align-
ment) [15] combines a sequence-based method with a consistency-based refine-
ment for pairwise alignment, a progressive algorithm for multiple alignment and
PROBCONS [16] a practical tool for progressive protein multiple sequence
alignment based on probabilistic consistency which is a novel scoring function
for measuring alignment quality.
Exact algorithms. In contrast to the previous approach, PIMA [17] uses local
dynamic programming to align only the most conserved motifs. In the default
setting it makes use of two alignment methods, maximum linkage and sequen-
tial branching, to decide the order of alignments ML PIMA and SB PIMA
respectively. Exact algorithms were developed to align multiple sequences simul-
taneously [18]. High memory requirement, high computational effort and limi-
tation on the number of sequences limit their usage. A new divide and conquer
algorithm [19] extending their capabilities was developed.
Iterative alignments. Iterative alignment methods depend on algorithms able
to produce an alignment and to refine it through a series of iterations until
no further improvements can be made. They are based on the idea that the
solution to a given problem can be computed by modifying an already existing
sub-optimal solution . Aligners which are based on this approach are:
-
DIALIGN [20,21], a consistency-based algorithm which attempts to use lo-
cal information in order to guide a global alignment, i.e. to construct multiple
alignments based on segment-to-segment comparisons — such segments are
incorporated into a multiple alignment using an iterative procedure.
Search WWH ::




Custom Search