If large sequences are to be aligned, program run time becomes
an issue. As with more traditional alignment methods, the
run time of DIALIGN for pairwise alignment is proportional to the
product of the lengths of the input sequences [16]. This is too slow
for aligning large genomic sequences. To speed up the program,
a previously developed anchoring option has proved useful.
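
To make the effect of anchoring on run time concrete, the following sketch
compares the number of position pairs that must be considered with and
without anchors. It rests on a simplified model that is an assumption of
this example, not a description of DIALIGN's internals: each anchor is a
pre-specified pair of equal-length segments fixed in the alignment, and only
the regions between consecutive anchors still have to be compared. The
function names full_cells and anchored_cells are hypothetical.

```python
from typing import List, Tuple

# Hypothetical anchor representation: (start1, start2, length), 0-based.
AnchorTuple = Tuple[int, int, int]


def full_cells(len1: int, len2: int) -> int:
    """Position pairs for an unanchored pairwise comparison (product of lengths)."""
    return len1 * len2


def anchored_cells(len1: int, len2: int, anchors: List[AnchorTuple]) -> int:
    """Position pairs left when only the gaps between anchors are compared.

    Assumes the anchors are mutually consistent, i.e. they appear in the
    same order in both sequences and do not overlap.
    """
    cells = 0
    prev1 = prev2 = 0  # end of the previous anchor in each sequence
    for start1, start2, length in sorted(anchors):
        cells += (start1 - prev1) * (start2 - prev2)
        prev1, prev2 = start1 + length, start2 + length
    # Region after the last anchor.
    cells += (len1 - prev1) * (len2 - prev2)
    return cells


if __name__ == "__main__":
    anchors = [(250, 250, 10), (500, 500, 10), (750, 750, 10)]
    print(full_cells(1000, 1000))               # 1000000
    print(anchored_cells(1000, 1000, anchors))  # 235300
```

In this toy case, three anchors cut the search space to roughly a quarter of
the full product, and the saving grows as more anchors are used.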
2 Anchored Alignment
Most MSA methods are fully automated and do not require any
human intervention. The input from the user is restricted to selecting
a set of input sequences and choosing the necessary parameter
values, e.g., for gap penalties. In most cases, default parameter
values are used that have been found useful in the typical situations
where a program is used.
Automated alignment is clearly required where no further
information about the input sequences is available. Also, if large
data sets are to be processed and manual intervention would be too
time consuming, automated MSA is mandatory. It should be clear,
however, that the accuracy of automatic methods for sequence
analysis is fundamentally limited. At best, they can produce align-
ments with a (near-)optimal quality score in some mathematical
sense. But there can be no guarantee that mathematically optimal
or high-scoring alignments are biologically meaningful.
The standard version of DIALIGN is fully automated, i.e., like
other MSA methods, it works without human intervention. The
only input parameter is a threshold T for the quality of the local
similarities considered for alignment. Often, however, an expert
user already has some information about (putative) homologies
among the input sequences. In this case, it is desirable to force an
MSA program to align these homologies and to align only the
remainder of the sequences in the usual automatic fashion.
For this reason, DIALIGN has an option for anchored alignment
where MSAs are produced in a semi-automatic way [17, 18].
With this option, the user can select parts of the input sequences
that are to be aligned to each other. The final alignment produced
by DIALIGN can then be seen as an extension of this user-specified
alignment anchor. In more detail, the user selects equal-length pairs
of sequence segments that will end up aligned to each other without
gaps. Such pairs of segments are called anchor points. In general,
it may not be possible to align all of the specified anchor points
in one single output alignment, so it may be necessary to discard
some of the user-defined anchor points. Therefore, the user has to
assign a score to each anchor point, which determines its priority in
case not all anchor points can be used.
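
The role of the anchor scores can be illustrated with a small sketch. It is
restricted to two sequences and uses a simple greedy rule (accept anchor
points in order of decreasing score, skipping any that conflict with those
already accepted). This is one plausible reading of "priority", given here
as an assumption rather than as DIALIGN's actual procedure. The names
Anchor, compatible and select_anchors are hypothetical.

```python
from typing import List, NamedTuple


class Anchor(NamedTuple):
    """Hypothetical record for one anchor point between two sequences."""
    start1: int   # 0-based start of the segment in the first sequence
    start2: int   # 0-based start of the segment in the second sequence
    length: int   # common length of both segments (aligned without gaps)
    score: float  # user-assigned priority


def compatible(a: Anchor, b: Anchor) -> bool:
    """True if a and b can both appear in one alignment.

    One anchor must end before the other starts in BOTH sequences.
    """
    a_before_b = (a.start1 + a.length <= b.start1
                  and a.start2 + a.length <= b.start2)
    b_before_a = (b.start1 + b.length <= a.start1
                  and b.start2 + b.length <= a.start2)
    return a_before_b or b_before_a


def select_anchors(anchors: List[Anchor]) -> List[Anchor]:
    """Greedily keep the highest-scoring mutually compatible anchors."""
    accepted: List[Anchor] = []
    for cand in sorted(anchors, key=lambda a: a.score, reverse=True):
        if all(compatible(cand, kept) for kept in accepted):
            accepted.append(cand)
    # Return the kept anchors in sequence order.
    return sorted(accepted, key=lambda a: a.start1)


if __name__ == "__main__":
    candidates = [
        Anchor(10, 12, 5, 3.0),
        Anchor(40, 20, 5, 1.0),  # crosses the anchor below, lower score
        Anchor(30, 45, 8, 2.0),
    ]
    for anchor in select_anchors(candidates):
        print(anchor)
```

Here the anchor with score 1.0 is discarded because it crosses the anchor
with score 2.0: it lies later in the first sequence but earlier in the
second. For more than two sequences, checking consistency is more involved
than this pairwise test.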
In addition to including expert knowledge in otherwise automatically
produced MSAs, the anchored-alignment option in