pair. To further improve the overall speed of this graph-based
alignment process, the alignment graph is pruned after each update
by removing any redundant edges, which makes the compatibility
verification step more efficient.
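The pruning step can be sketched as follows. This is a minimal illustration, not PicXAA's implementation: it assumes that "redundant" means an edge u -> v whose ordering is already implied by a longer path from u to v (a transitive reduction of the directed acyclic alignment graph). The function name `prune_redundant_edges` and the edge-set representation are hypothetical.

```python
from collections import defaultdict


def prune_redundant_edges(edges):
    """Drop every edge (u, v) of a DAG that is implied by a longer path u ~> v.

    `edges` is an iterable of (u, v) pairs. Assumes the graph is acyclic,
    in which case the result is its unique transitive reduction.
    """
    edges = set(edges)
    succ = defaultdict(set)
    for u, v in edges:
        succ[u].add(v)

    def implied_by_longer_path(u, v):
        # Search from the other successors of u; reaching v means the
        # direct edge u -> v adds no ordering information.
        stack = [w for w in succ[u] if w != v]
        seen = set(stack)
        while stack:
            node = stack.pop()
            if node == v:
                return True
            for nxt in succ.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    return {(u, v) for (u, v) in edges if not implied_by_longer_path(u, v)}
```

With fewer edges, a later check of whether a new residue pair is compatible with the existing order has fewer paths to traverse, which is the efficiency gain the text refers to.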
2.3 Mapping the Graph to a Multiple Sequence Alignment
Except for possibly a few nodes that may not have any priority
relative to each other in G, there is a one-to-one relationship
between the final alignment graph G and the multiple sequence
alignment. To find the final MSA, we only have to arrange the
columns (represented as nodes in G) such that the relative order of
the corresponding nodes in this linear arrangement does not conflict
with that in the final alignment graph G. This can be easily achieved
by using a depth-first-search algorithm to arrange the nodes in a
linear directed path P, according to their topological ordering.
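The linear arrangement described above can be sketched with a depth-first-search topological sort. This is an illustrative sketch only, not PicXAA's code: the function names, the edge-list representation of G, and the representation of a column as a mapping from a sequence identifier to one residue are all assumptions made for the example.

```python
def linearize_columns(nodes, edges):
    """Return the nodes of a DAG in a topological order (the path P).

    Uses an iterative depth-first search; raises ValueError on a cycle.
    """
    succ = {n: [] for n in nodes}
    for u, v in edges:
        succ[u].append(v)

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the DFS stack, finished
    color = {n: WHITE for n in nodes}
    postorder = []

    for root in nodes:
        if color[root] != WHITE:
            continue
        color[root] = GRAY
        stack = [(root, iter(succ[root]))]
        while stack:
            node, children = stack[-1]
            for child in children:
                if color[child] == GRAY:
                    raise ValueError("alignment graph contains a cycle")
                if color[child] == WHITE:
                    color[child] = GRAY
                    stack.append((child, iter(succ[child])))
                    break
            else:
                # All successors are finished, so this node is finished too.
                color[node] = BLACK
                postorder.append(node)
                stack.pop()

    # Reversed DFS finishing order is a valid topological order.
    return postorder[::-1]


def columns_to_rows(order, columns, seq_ids):
    """Write out the MSA rows, using '-' where a sequence has no residue."""
    return {s: "".join(columns[c].get(s, "-") for c in order) for s in seq_ids}
```

Nodes that have no priority relative to each other may appear in either order; any order returned by the sort is consistent with G, which matches the exception noted in the text.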
2.4 Improving the Alignment Quality in Low Confidence Regions
The alignment quality of the regions that mainly consist of residue
pairs with low alignment probabilities can be further improved by
performing selective profile-profile alignments. Rather than taking
a random split and realignment strategy as in [21], which may break
the confidently aligned residue pairs that have high alignment
probabilities, PicXAA adopts an iterative refinement technique,
which first aligns each sequence with a set of highly similar
sequences in S, and then aligns the resulting sequence profile with
the profile that consists of the remaining sequences (see Note 8). In
this way, PicXAA takes advantage of both the intra-family similarity
and the inter-family similarity, thereby improving the overall
quality of the MSA in low similarity regions without disrupting the
residue alignments in high confidence regions (see Note 4).
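The selective split behind this refinement can be sketched as below. The sketch covers only how the sequence set S might be divided for one target sequence; the profile-profile alignment itself is not shown. The similarity scores, the threshold value, and the function name are hypothetical, since the criterion PicXAA uses to pick the highly similar sequences is not given in this section.

```python
def split_for_refinement(similarity, target, threshold=0.6):
    """Split the sequence set into (similar group, remaining sequences).

    `similarity[a][b]` is an assumed pairwise similarity score in [0, 1].
    The target sequence always belongs to the similar group.
    """
    similar, remaining = [target], []
    for other, score in similarity[target].items():
        if other == target:
            continue
        if score >= threshold:
            similar.append(other)
        else:
            remaining.append(other)
    return similar, remaining
```

In contrast to a random split, a split of this kind keeps closely related sequences in the same profile, so their confidently aligned residue pairs are not separated before realignment.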
2.5 Other Relevant Versions of PicXAA
A similar approach can also be used for the structural alignment of
noncoding RNAs (ncRNAs). Recently, PicXAA-R [23] has extended
the basic idea of PicXAA by additionally incorporating RNA folding
information to predict accurate multiple RNA sequence alignments.
There is also a Web-based platform called PicXAA-Web [24], which is
designed to integrate PicXAA and PicXAA-R in a user-friendly Web
environment for accurate alignment and analysis of multiple protein,
DNA, and RNA sequences. PicXAA-Web can be freely accessed at:
http://gsp.tamu.edu/picxaa
3 Notes
1. Generally, PicXAA can be used with any estimation scheme for
computing the pairwise residue alignment probabilities. Currently,
PicXAA allows the user to choose from three different
methods for computing the alignment probabilities: (a) the
pair-HMM approach implemented in ref. 10, (b) the structural
pair-HMM approach used in ref. 15, and (c) the partition