MAFFT: Iterative Refinement and Additional Methods - Multiple Sequence Alignment Methods - page 125

Biology Reference

In-Depth Information

Fig. 1 Calculation procedure for the progressive options (reprinted from [ 57 ])

MAFFT assumes that the sequences involved in an MSA are

homologous, that is, sharing a single common ancestral sequence.

MAFFT always generates an MSA that has all the letters of the input

sequences. The order of the letters in each sequence is identical to

that of the input sequence, although the sequences can be reor-

dered according to similarity.

The progressive method [ 9 , 10 ] is the most basic MSA algorithm.

A guide tree is created based on all-to-all pairwise comparisons, and

an MSA is constructed using a group-to-group alignment algo-

rithm at each node of the guide tree. To achieve a reasonable

balance between speed and accuracy, MAFFT uses a two-cycle

progressive method, FFT-NS-2, shown in Fig. 1 . First, low-

quality all-pairwise distances are rapidly calculated, based on the

number of shared k mers, to build an initial tree [ 11 ]. k is 6 for both

protein and nucleotide sequences as noted in [ 1 ]. A tentative MSA

is computed using the initial guide tree, and then a refined tree is

built using the tentative MSA. Finally, an MSA is constructed based

on the refined tree. This is the default option

2.1 Progressive

Methods

A faster option, FFT-NS-1, which performs the first cycle only, is

also available.

Next Page

Multiple Sequence Alignment Methods

Search WWH ::

Custom Search

Home