Multiple Sequence Alignment with DIALIGN - Multiple Sequence Alignment Methods


DIALIGN can be used to speed up the alignment procedure. Indeed, if an anchor point enforces the alignment of two selected sequence segments, this reduces the search space of the remaining automatic alignment procedure (e.g., if the middle positions of two sequences are used as an anchor point, the search space for the pairwise alignment is reduced by a factor of two).
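The factor of two can be checked with simple arithmetic. The following is a minimal sketch, assuming that the search space of a pairwise alignment is measured by the number of cells in a full dynamic-programming matrix and that an anchor is a single pair of positions that the alignment must pass through. The function names `dp_cells` and `anchored_cells` are hypothetical and are not part of DIALIGN.

```python
def dp_cells(len1, len2):
    """Number of cells in a full pairwise dynamic-programming matrix."""
    return len1 * len2


def anchored_cells(len1, len2, anchors):
    """Cells left to search when the alignment must pass through each anchor.

    `anchors` is a list of (pos1, pos2) pairs, assumed to be consistent,
    i.e., strictly increasing in both sequences. Each anchor splits the
    matrix into independent sub-matrices before and after it.
    """
    total = 0
    prev1, prev2 = 0, 0
    for pos1, pos2 in sorted(anchors):
        total += dp_cells(pos1 - prev1, pos2 - prev2)
        prev1, prev2 = pos1, pos2
    # Remaining block after the last anchor.
    total += dp_cells(len1 - prev1, len2 - prev2)
    return total


full = dp_cells(1000, 1000)
halved = anchored_cells(1000, 1000, [(500, 500)])
print(full, halved, full / halved)  # 1000000 500000 2.0
```

Under the same assumptions, three evenly spaced anchors would reduce the search space by a factor of four, which illustrates why many anchor points help on long sequences.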

Therefore, the anchoring option was also used to align long genomic sequences [19, 20]. Here, a fast method for local homology detection such as BLAST [21] can be used to find strong sequence homologies that can then be used as anchor points for DIALIGN. This approach has been implemented and made available on our web server [19]. Our anchored-alignment approach to genomic sequence comparison has also been used to improve the performance of gene-finding methods in eukaryotes [22]. Other applications of anchored multiple alignment include studying the behavior of alignment methods in detail and integrating new algorithmic approaches for multiple alignment in place of the greedy heuristic used in the standard version of DIALIGN [23].
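Local hits can only serve as anchor points if they are mutually consistent, i.e., if they can all appear in one alignment without crossing or overlapping. The following is a minimal sketch of one way to select such a set from a list of local hits, using standard maximum-weight collinear chaining. It is an illustration only and does not describe how the web server or DIALIGN processes BLAST output. The `Hit` type, the `chain_hits` function, and the example coordinates are hypothetical, and hits are assumed to be ungapped, so each hit has the same length in both sequences.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    start1: int   # start position in sequence 1
    start2: int   # start position in sequence 2
    length: int   # ungapped length (assumed equal in both sequences)
    score: float  # similarity score of the hit


def chain_hits(hits: List[Hit]) -> List[Hit]:
    """Return a maximum-score chain of collinear, non-overlapping hits."""
    if not hits:
        return []
    hits = sorted(hits, key=lambda h: (h.start1, h.start2))
    n = len(hits)
    best = [h.score for h in hits]  # best chain score ending at each hit
    prev = [-1] * n                 # predecessor index in that chain
    for j in range(n):
        for i in range(j):
            a, b = hits[i], hits[j]
            follows = (a.start1 + a.length <= b.start1
                       and a.start2 + a.length <= b.start2)
            if follows and best[i] + b.score > best[j]:
                best[j] = best[i] + b.score
                prev[j] = i
    # Trace back from the hit that ends the best-scoring chain.
    j = max(range(n), key=best.__getitem__)
    chain = []
    while j != -1:
        chain.append(hits[j])
        j = prev[j]
    return chain[::-1]


hits = [
    Hit(100, 120, 50, 80.0),
    Hit(130, 90, 40, 60.0),   # crosses the first hit
    Hit(300, 310, 60, 95.0),
    Hit(500, 520, 30, 40.0),
]
for anchor in chain_hits(hits):
    print(anchor)
```

In this example the second hit is dropped because it conflicts with the higher-scoring first hit, and the remaining three hits form a consistent set of candidate anchor points.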

3 DIALIGN-T and DIALIGN-TX

Studies have shown that DIALIGN is often superior to other MSA tools where sequences with local homologies are aligned. On globally related sequences with weak primary-sequence similarity, however, it tends to be outperformed by strictly global methods such as CLUSTAL W [24], MUSCLE [5, 25], MAFFT [4], or PROBCONS [26]. One might think that a possible reason for this relative weakness is the greedy optimization method used for multiple alignment in DIALIGN. Indeed, it is easy to see that the greedy heuristic in DIALIGN can produce MSAs with scores far below that of the optimal MSA. If that were the reason for the relative weakness of the program on global, weak homologies, one would make efforts to find more efficient optimization algorithms, leading to higher-scoring MSAs in the sense of the fragment-based scoring function used in DIALIGN. This has been done in the past, e.g., in [27, 28]. More recent results based on anchored alignments indicate, however, that the relative weakness of DIALIGN on global homologies with low similarity at the primary-sequence level is caused by the underlying objective function, and not so much by the greedy optimization algorithm. Thus, MSAs with mathematically higher scores are not necessarily biologically more meaningful. We therefore adopted other approaches to improve the performance of DIALIGN on those sequence families where strictly global MSA methods were still superior. This resulted in the development of DIALIGN-T and DIALIGN-TX.
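The observation that a greedy heuristic can end up far below the optimal score can be illustrated with a toy model. The following is a minimal sketch, assuming that fragments are tuples `(start1, start2, length, weight)` between two sequences and that two fragments are consistent if they neither overlap nor cross. The two-sequence setting is chosen purely for brevity and is not meant to reproduce DIALIGN's actual consistency check for multiple sequences. All names and numbers are hypothetical.

```python
from itertools import combinations


def compatible(f, g):
    """True if two fragments can appear in the same alignment."""
    a, b = sorted([f, g])  # order by start position in sequence 1
    return a[0] + a[2] <= b[0] and a[1] + a[2] <= b[1]


def total(frags):
    """Sum of fragment weights."""
    return sum(f[3] for f in frags)


def greedy(frags):
    """Accept fragments by decreasing weight if consistent with earlier picks."""
    chosen = []
    for f in sorted(frags, key=lambda f: f[3], reverse=True):
        if all(compatible(f, g) for g in chosen):
            chosen.append(f)
    return chosen


def optimum(frags):
    """Brute-force search over all consistent subsets (toy sizes only)."""
    best = []
    for k in range(1, len(frags) + 1):
        for subset in combinations(frags, k):
            pairs_ok = all(compatible(f, g) for f, g in combinations(subset, 2))
            if pairs_ok and total(subset) > total(best):
                best = list(subset)
    return best


frags = [
    (20, 0, 25, 10.0),  # heaviest fragment, conflicts with all others
    (0, 0, 10, 6.0),
    (20, 20, 10, 6.0),
    (40, 40, 10, 6.0),
]
print(total(greedy(frags)), total(optimum(frags)))  # 10.0 18.0
```

Here the greedy procedure commits to the single heaviest fragment and thereby excludes three mutually consistent fragments whose combined weight is higher. As the text notes, closing this gap raises the numerical score but does not by itself guarantee a biologically better alignment.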
