PicXAA: A Probabilistic Scheme for Finding the Maximum Expected Accuracy Alignment of Multiple Biological Sequences - Multiple Sequence Alignment Methods


pair. To further improve the overall speed of this graph-based alignment process, the alignment graph is pruned after each update by removing any redundant edges, which makes the compatibility verification step more efficient.
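The pruning step can be illustrated with a minimal sketch. It assumes that a "redundant" edge is one whose ordering constraint is already implied by a longer path in the directed acyclic alignment graph (i.e., a transitive reduction); the text above does not spell out the exact criterion, and the function names `reachable` and `prune_redundant_edges` are hypothetical, not part of PicXAA.

```python
from collections import defaultdict


def reachable(adj, start, target, skip_edge):
    """Return True if target can be reached from start without using skip_edge."""
    stack, seen = [start], {start}
    while stack:
        node = stack.pop()
        for nxt in adj[node]:
            if (node, nxt) == skip_edge or nxt in seen:
                continue
            if nxt == target:
                return True
            seen.add(nxt)
            stack.append(nxt)
    return False


def prune_redundant_edges(edges):
    """Drop every edge (u, v) for which another path from u to v exists.

    Assumption: the input edges form a directed acyclic graph.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
    for u, v in sorted(edges):
        if reachable(adj, u, v, skip_edge=(u, v)):
            adj[u].discard(v)  # the order u -> v is still implied by the other path
    return sorted((u, v) for u in adj for v in adj[u])


pruned = prune_redundant_edges([("a", "b"), ("b", "c"), ("a", "c")])
```

Here the edge from `a` to `c` is dropped because the path through `b` already enforces the same ordering, so later compatibility checks have fewer edges to traverse.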

2.3 Mapping the Graph to a Multiple Sequence Alignment

Except for possibly a few nodes that may not have any priority to each other in G, there is a one-to-one relationship between the final alignment graph G and the multiple sequence alignment. To find the final MSA, we only have to arrange the columns (represented as nodes in G) such that the relative order of the corresponding nodes in this linear arrangement does not conflict with that in the final alignment graph G. This can be easily achieved by using a depth-first-search algorithm to arrange the nodes in a linear directed path P, according to their topological ordering.
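The mapping described above can be sketched as a depth-first-search topological sort followed by writing out one column per node. This is only an illustration under stated assumptions: the column representation (a dictionary from sequence index to residue position), the edge list, and the names `topological_order` and `columns_to_msa` are hypothetical and do not reflect PicXAA's actual data structures.

```python
def topological_order(nodes, edges):
    """Iterative DFS topological sort; an edge (u, v) means u must precede v."""
    succ = {n: [] for n in nodes}
    for u, v in edges:
        succ[u].append(v)
    visited, postorder = set(), []
    for root in nodes:
        if root in visited:
            continue
        visited.add(root)
        stack = [(root, iter(succ[root]))]
        while stack:
            node, children = stack[-1]
            for child in children:
                if child not in visited:
                    visited.add(child)
                    stack.append((child, iter(succ[child])))
                    break
            else:
                # All successors are finished, so this node is finished too.
                stack.pop()
                postorder.append(node)
    return postorder[::-1]  # reversed post-order is a topological order


def columns_to_msa(sequences, columns, edges):
    """Write the MSA rows; columns maps a node id to {sequence index: position}."""
    rows = [[] for _ in sequences]
    for col in topological_order(list(columns), edges):
        members = columns[col]
        for i, seq in enumerate(sequences):
            rows[i].append(seq[members[i]] if i in members else "-")
    return ["".join(r) for r in rows]


sequences = ["AC", "AGC"]
columns = {"c2": {0: 1, 1: 2}, "c0": {0: 0, 1: 0}, "c1": {1: 1}}
edges = [("c0", "c1"), ("c1", "c2")]
msa = columns_to_msa(sequences, columns, edges)
```

Nodes that have no priority relative to each other may appear in either order; any order returned by the sort is consistent with the graph.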

2.4 Improving the Alignment Quality in Low Confidence Regions

The alignment quality of the regions that mainly consist of residue pairs with low alignment probabilities can be further improved by performing selective profile-profile alignments. Rather than taking a random split and realignment strategy as in [21], which may break the confidently aligned residue pairs that have high alignment probabilities, PicXAA adopts an iterative refinement technique, which first aligns each sequence with a set of highly similar sequences in S, and then aligns the resulting sequence profile with the profile that consists of the remaining sequences (see Note 8). In this way, PicXAA takes advantage of both the intra-family similarity and the inter-family similarity, thereby improving the overall quality of the MSA in low similarity regions without disrupting the residue alignments in high confidence regions (see Note 4).
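The selective split that precedes each profile-profile realignment might look roughly like the sketch below. It covers only the partition step, not the realignment itself. The similarity table, the threshold value of 0.6, and the names `split_for_refinement` and `drop_all_gap_columns` are assumptions made for illustration; the text does not specify how PicXAA measures similarity or chooses the set of highly similar sequences.

```python
def split_for_refinement(msa, similarity, seed, threshold=0.6):
    """Split row indices into the seed's similar group and the remaining rows.

    msa: list of equal-length gapped strings.
    similarity: dict mapping frozenset({i, j}) to a score in [0, 1] (hypothetical).
    """
    group = [seed] + [
        j for j in range(len(msa))
        if j != seed and similarity[frozenset((seed, j))] >= threshold
    ]
    rest = [j for j in range(len(msa)) if j not in group]
    return group, rest


def drop_all_gap_columns(rows):
    """Remove columns that contain only gaps after a group has been extracted."""
    keep = [c for c in range(len(rows[0])) if any(r[c] != "-" for r in rows)]
    return ["".join(r[c] for c in keep) for r in rows]


msa = ["A-CG", "A-CG", "ATC-"]
similarity = {
    frozenset((0, 1)): 0.9,
    frozenset((0, 2)): 0.2,
    frozenset((1, 2)): 0.3,
}
group, rest = split_for_refinement(msa, similarity, seed=0)
profile_a = drop_all_gap_columns([msa[i] for i in group])
profile_b = drop_all_gap_columns([msa[i] for i in rest])
```

The two resulting profiles would then be passed to a profile-profile aligner. Because the split follows similarity rather than chance, closely related sequences stay together and their confidently aligned columns are not broken apart.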

2.5 Other Relevant Versions of PicXAA

A similar approach can also be used for the structural alignment of noncoding RNAs (ncRNAs). Recently, PicXAA-R [23] has extended the basic idea of PicXAA by additionally incorporating RNA folding information to predict accurate multiple RNA sequence alignments. There is also a Web-based platform called PicXAA-Web [24], which is designed to integrate PicXAA and PicXAA-R in a user-friendly Web environment for accurate alignment and analysis of multiple protein, DNA, and RNA sequences. PicXAA-Web can be freely accessed at: http://gsp.tamu.edu/picxaa

3 Notes

1. Generally, PicXAA can be used with any estimation scheme for computing the pairwise residue alignment probabilities. Currently, PicXAA allows the user to choose from three different methods for computing the alignment probabilities: (a) the pair-HMM approach implemented in ref. 10, (b) the structural pair-HMM approach used in ref. 15, and (c) the partition
