3 Difference Between MAFFT and MUSCLE
MUSCLE [7, 8] is another high-performance MSA program. It
adopted the overall design of the NW-NS-i option of MAFFT (see
Subheading 2.2). Other options corresponding to NW-NS-1 and
NW-NS-2 (see Subheading 2.1) can be selected by specifying the
number of iterations. The accuracies of these options are close to
those of the corresponding MAFFT options. However, MUSCLE and
MAFFT differ in several respects, such as the scoring system, the
weighting system, and so on. Among these, MUSCLE made a
great contribution to this area by introducing an approximate
tree-building algorithm with a time complexity of O(N^2), where
N is the number of sequences. At that time, this algorithm was
remarkably faster than those used by other programs. It was
subsequently adopted by MAFFT [39] and the Clustal series [40].
MAFFT made a slight modification such that the resulting tree is
exactly identical to that produced by the standard method. Due to
this modification, the tree-building step is slightly faster in
MUSCLE than in MAFFT without the PartTree option.
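To make concrete what building a guide tree from pairwise distances involves, here is a minimal sketch of naive UPGMA (average-linkage) clustering. It is an illustration only: the function name and the input layout are assumptions of this sketch, and because it rescans all remaining pairs at every merge it runs in cubic time overall. It is therefore not the O(N^2) algorithm discussed above, nor the code used by MUSCLE or MAFFT.

```python
from itertools import combinations

def upgma(names, dist):
    """Naive UPGMA; returns the tree as nested tuples of labels.

    names: list of sequence labels
    dist:  dict mapping frozenset({a, b}) -> distance for every pair of labels
    """
    # Active clusters: id -> (subtree, number of leaves under it)
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    d = {frozenset((i, j)): dist[frozenset((names[i], names[j]))]
         for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(clusters) > 1:
        # Full scan over remaining pairs: this is what makes the naive version slow.
        i, j = min(combinations(sorted(clusters), 2),
                   key=lambda pair: d[frozenset(pair)])
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        # Distance from the merged cluster to each other cluster is the
        # size-weighted average of the two old distances.
        for k in clusters:
            d[frozenset((next_id, k))] = (
                n_i * d[frozenset((i, k))] + n_j * d[frozenset((j, k))]
            ) / (n_i + n_j)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree
```

With four sequences in which A and B are close to each other and C and D are close to each other, the sketch groups A with B and C with D before joining the two pairs.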
4 Dot Plot
All the options in MAFFT assume that there are no genomic
rearrangements (translocations or inversions). By default, MAFFT
accelerates the group-to-group alignment calculation with the FFT
algorithm [1]. It first finds highly conserved regions and then
aligns the remaining regions using DP, as shown in Fig. 2. Thus
MAFFT can align long DNA sequences more efficiently than normal
DP if a number of highly conserved regions are found. Genomic
rearrangements can result in conserved regions that appear in an
inconsistent order. In such a case, DP has to be applied almost
directly. This sometimes takes an impractically long time, and the
result does not make sense.
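The idea of locating conserved regions by FFT can be illustrated with a small sketch. It assumes that each base is encoded as a 0/1 indicator vector and uses a textbook radix-2 FFT written out here so that the example is self-contained. The function names are hypothetical, and the encoding and segment detection in MAFFT itself are more elaborate. A peak in the correlation at lag k suggests that position j of the second sequence corresponds to position j + k of the first.

```python
import cmath

def fft(values, inverse=False):
    """Textbook recursive radix-2 FFT (unnormalized); length must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def matches_by_lag(seq1, seq2):
    """Map each lag k to the number of positions j with seq1[j + k] == seq2[j]."""
    size = 1
    while size < len(seq1) + len(seq2):  # zero-pad so circular lags do not overlap
        size *= 2
    total = [0.0] * size
    for base in "ACGT":
        x = [1.0 if c == base else 0.0 for c in seq1] + [0.0] * (size - len(seq1))
        y = [1.0 if c == base else 0.0 for c in seq2] + [0.0] * (size - len(seq2))
        # Cross-correlation theorem: corr = IFFT(FFT(x) * conj(FFT(y)))
        spectrum = [p * q.conjugate() for p, q in zip(fft(x), fft(y))]
        corr = fft(spectrum, inverse=True)
        for i in range(size):
            total[i] += corr[i].real / size
    # Non-negative lags sit at the start of the array, negative lags wrap to the end.
    lags = {k: round(total[k]) for k in range(len(seq1))}
    lags.update({-k: round(total[size - k]) for k in range(1, len(seq2))})
    return lags
```

Calling `max(lags, key=lags.get)` on the returned dictionary gives the lag with the most matching positions. When conserved regions appear in an inconsistent order, no single set of lags can be arranged collinearly, which is the situation described above in which DP has to be applied almost directly.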
To avoid such cases, the web version of MAFFT displays dot
plots between the first sequence and the remaining sequences,
using the LAST local alignment program [41], for every nucleotide
alignment run. By viewing the dot plots, a user can easily check for
genomic rearrangements and the directions of input sequences.
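The way a dot plot reveals sequence direction can be sketched with exact k-mer matches. This is a simplified stand-in, not the LAST program used by the web server: the function names and the default k are assumptions of this sketch, and real local alignment tolerates mismatches and gaps. Each hit (i, j) is one dot; dots from the forward comparison run along rising diagonals, while a sequence given in the opposite direction produces more hits against its reverse complement.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string over A, C, G, T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[c] for c in reversed(seq))

def kmer_hits(ref, query, k=4):
    """Return (i, j) pairs where ref[i:i+k] == query[j:j+k]; each pair is one dot."""
    index = {}
    for i in range(len(ref) - k + 1):
        index.setdefault(ref[i:i + k], []).append(i)
    return [(i, j)
            for j in range(len(query) - k + 1)
            for i in index.get(query[j:j + k], [])]

def guess_direction(ref, query, k=4):
    """Compare dot counts for the query as given and for its reverse complement."""
    forward = len(kmer_hits(ref, query, k))
    reverse = len(kmer_hits(ref, revcomp(query), k))
    return "forward" if forward >= reverse else "reverse"
```

A query for which `guess_direction` returns "reverse" corresponds to the case in which the input should be reoriented before alignment.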
Some examples are shown in Fig. 4. If a plot like d is returned
by the server, the calculation should be re-run with the “Adjust
direction” option (for the web version) or with the
--adjustdirection option (for the command-line version), as noted
in the next section. If a more complicated plot, like e, is returned, other
tools that assume genomic rearrangements should be applied,
4.1 Example
 