MAFFT: Iterative Refinement and Additional Methods - Multiple Sequence Alignment Methods - page 129


These types of methods have been intensively studied recently, and many alternative methods, such as PicXAA-RNA [5], CentroidAlign [36], and RCoffee [37], are available.

2.5 Profile Alignments

MAFFT has a subprogram to align two alignments.

This program is useful only when the two alignments are phylogenetically separated. Careless application of this method results in serious misalignments, as shown in [38] and Subheading 6.

We are preparing a safer option, --addprofile, to avoid such mistakes.

This option does not return any result if the sequences in alignment1 do not form a monophyletic cluster. Thus, this method is not useful for every user and is still in the testing phase.
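The monophyly condition above can be illustrated with a small sketch. This is not MAFFT's implementation; it assumes a rooted tree written as nested Python tuples with sequence names as leaves, and the function names `leaves` and `is_monophyletic` are hypothetical. A group is treated as monophyletic when some subtree contains exactly the members of the group and nothing else.

```python
def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree (assumed format)."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def is_monophyletic(tree, group):
    """Return True if some subtree's leaf set equals `group` exactly."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    # Otherwise look for a matching clade further down the tree.
    return any(is_monophyletic(child, group) for child in tree)


# Hypothetical rooted tree over five sequences.
tree = ((("a", "b"), "c"), ("d", "e"))
print(is_monophyletic(tree, {"a", "b", "c"}))  # a clade of the tree
print(is_monophyletic(tree, {"a", "c"}))       # not a clade: "b" is missing
```

Under this reading, a check of this kind would have to succeed for the sequences in alignment1 before the two alignments are combined. How the tree is obtained, and how an unrooted tree would be handled, are left open here.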

2.6 MSA of a Large Number of Sequences

To align a large number of sequences, MAFFT has an approximate option, PartTree [39], which skips the calculation of the full distance matrix consisting of O(N^2) elements, where N is the number of sequences. Instead, n sequences are randomly selected, and the distances between these n sequences and the remaining sequences are computed to classify the sequences into n groups. Each of the n groups is recursively subjected to the same process, creating a tree-like classification. The time complexity of this process is O(N log N).

There are several subtypes of the PartTree option. In the fastest one, distances are computed based on the number of shared 6mers. A more accurate subtype is also available, in which distances are computed based on DP. The application of DP to a large dataset might seem impractical, but as a result of the PartTree algorithm, we can drastically restrict the number of DP runs. Accordingly, this option is feasible and gives slightly better accuracy than the 6mer-based option in our tests. See [39] for details. The latest version of the Clustal series, Clustal Omega [6], provides an alternative method for large MSA, using the mBed algorithm [40].
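The recursive grouping behind PartTree can be sketched in a few lines of Python. This is a simplified illustration, not MAFFT's code: the function names (`kmer_distance`, `part_tree`, `flatten`), the distance formula (one minus the fraction of shared 6mers), and the stopping rule are all assumptions made for the example. The 6mer-based distance stands in for the fastest subtype; a DP-based distance could be substituted in the same place.

```python
import random

K = 6  # word length used for the 6mer-based distance


def kmers(seq, k=K):
    """Set of all length-k words in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_distance(a, b, k=K):
    """Assumed distance: 1 - (shared k-mers / size of the smaller k-mer set)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 1.0
    return 1.0 - len(ka & kb) / min(len(ka), len(kb))


def part_tree(names, seqs, n=2, rng=None):
    """Classify names into n groups around random representatives, recursively.

    Returns a nested list of names. Only n * len(names) distances are
    computed per level, never the full distance matrix.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    rng = rng or random.Random(0)
    if len(names) <= n:
        return list(names)
    reps = rng.sample(names, n)
    groups = {r: [r] for r in reps}
    for name in names:
        if name in groups:
            continue  # representatives are already placed
        nearest = min(reps, key=lambda r: kmer_distance(seqs[name], seqs[r]))
        groups[nearest].append(name)
    # Every group is strictly smaller than `names`, so the recursion terminates.
    return [part_tree(g, seqs, n, rng) for g in groups.values()]


def flatten(tree):
    """List the leaf names of a nested-list classification."""
    out = []
    for item in tree:
        if isinstance(item, str):
            out.append(item)
        else:
            out.extend(flatten(item))
    return out


# Hypothetical toy input: two families of similar sequences.
seqs = {
    "s1": "ACGTACGTACGTAA", "s2": "ACGTACGTACGTAC", "s3": "ACGTACGTACGAAA",
    "s4": "TTGGCCAATTGGCC", "s5": "TTGGCCAATTGGCA", "s6": "TTGGCCAATTGGAA",
}
tree = part_tree(list(seqs), seqs, n=2)
print(tree)
```

Each level compares every sequence only with the n representatives, so the work per level grows linearly with the number of sequences. When the groups stay reasonably balanced, the number of levels grows logarithmically, which is where the O(N log N) figure comes from.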
