Aligning Multiple Protein Sequences by Hybrid Clonal Selection Algorithm with Insert-Remove-Gaps and BlockShuffling Operators - Artificial Immune Systems

Information Technology Reference

In-Depth Information

4

The State-of-Art Methods for MSA

The most popular method to solve MSA is based on Dynamic Programming (DP)

[9], which guarantees a mathematically optimal alignment. However, the method

is limited to a small number of short sequences, since the size of the problem

space increases with the number of sequences and their length. To overcome this

problem several heuristic approaches, based on different strategies, have been

developed to effectively deal with the computational complexity of the problem.

All current methodologies of multiple alignment are heuristics and can be

classify in three main categories: progressive alignments , exact algorithms and

iterative alignments .

Progressive Alignments. In progressive alignment methods multiple alignments

are performed, first aligning the closest sequences and then the more distant

ones are added. Although this approach has the advantage of being simplistic

and very fast, it does not guarantee any level of optimization.

Therefore, the main drawback of this approach is that once a sequence

has been aligned it cannot be modified, causing possible conflicts with suc-

cessively added sequences. Alignment programs based on this approach are

MULTALIGN [10], PILEUP [11], CLUSTALX [12], CLUSTALW [13], T-

COFFEE [14]. Their strategy is to align sequences in a progressive manner us-

ing a consistency-based objective function in order to minimize possible errors.

SPEM (sequence and Secondary-structure Profiles Enhanced Multiple align-

ment) [15] combines a sequence-based method with a consistency-based refine-

ment for pairwise alignment, a progressive algorithm for multiple alignment and

PROBCONS [16] a practical tool for progressive protein multiple sequence

alignment based on probabilistic consistency which is a novel scoring function

for measuring alignment quality.

Exact algorithms. In contrast to the previous approach, PIMA [17] uses local

dynamic programming to align only the most conserved motifs. In the default

setting it makes use of two alignment methods, maximum linkage and sequen-

tial branching, to decide the order of alignments ML PIMA and SB PIMA

respectively. Exact algorithms were developed to align multiple sequences simul-

taneously [18]. High memory requirement, high computational effort and limi-

tation on the number of sequences limit their usage. A new divide and conquer

algorithm [19] extending their capabilities was developed.

Iterative alignments. Iterative alignment methods depend on algorithms able

to produce an alignment and to refine it through a series of iterations until

no further improvements can be made. They are based on the idea that the

solution to a given problem can be computed by modifying an already existing

sub-optimal solution . Aligners which are based on this approach are:

-

DIALIGN [20,21], a consistency-based algorithm which attempts to use lo-

cal information in order to guide a global alignment, i.e. to construct multiple

alignments based on segment-to-segment comparisons — such segments are

incorporated into a multiple alignment using an iterative procedure.

Artificial Immune Systems

Search WWH ::

Custom Search

Home