works very well and can reconstruct alignments with lengths very
close to the true length (Fig. 2). When the underlying assumptions
hold, the method in principle scales up to any number of sequences.
The phylogeny-aware algorithm reconstructs ancestral
sequences with information about sites that are believed to be
insertions. The ancestral sequences are required for the alignment,
but they are also useful in their own right: PRANK can output the
inferred ancestral sequences along with the alignment of the extant
sequences, using gaps to mark sites that are believed to have been
inserted later and were not present in the ancestors. Such
alignments are unique and make it possible to study the process of
change and to time certain events to specific evolutionary branches.
In addition
to ancestral sequences, the algorithm also infers the type of
mutation events that have caused the length differences between
the sequences and can provide this information in the output.
Although an experienced user may distinguish insertions and
deletions from the gap patterns they create, marking the gaps caused
by insertions and by deletions with different symbols, as PRANK can
do, is helpful.
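The way ancestral gaps separate insertions from deletions can be
illustrated with a minimal sketch. It assumes one aligned row for a
parent (ancestral) sequence and one for its child, with "-" as the
gap character. The function name classify_branch_events and the
labels it returns are hypothetical and are not PRANK's output format.

```python
def classify_branch_events(ancestor: str, descendant: str, gap: str = "-") -> list[str]:
    """Label each alignment column for one parent-to-child branch."""
    if len(ancestor) != len(descendant):
        raise ValueError("aligned rows must have equal length")
    labels = []
    for a, d in zip(ancestor, descendant):
        if a == gap and d == gap:
            labels.append("absent")        # site exists only elsewhere in the tree
        elif a == gap:
            labels.append("insertion")     # child has a site the parent lacks
        elif d == gap:
            labels.append("deletion")      # parent had the site, child lost it
        elif a == d:
            labels.append("match")
        else:
            labels.append("substitution")
    return labels


# Hypothetical toy rows: two sites inserted in the child, one deleted.
ancestor = "AC--GT"
descendant = "ACTTG-"
events = classify_branch_events(ancestor, descendant)
```

Without the ancestral row, the two gap patterns would look alike;
with it, each gap can be assigned to an event on a specific branch.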
The explanation and illustration of the flagging approach used
by PRANK are slightly simplified and consider only one level of
flagging. In practice, the algorithm marks the gaps in the
immediately preceding alignments and, for the sites not cleared of
flags, for the one before that. This procedure prevents long
deletions in one branch from masking overlapping insertions in the
descendants of its sister branch. For details, see [5, 6].
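The two-level flagging described above can be pictured with a toy
state model. This is a sketch under stated assumptions, not PRANK's
implementation: the names Site, flag_age, next_state and
MAX_FLAG_LEVELS are hypothetical, and the rule that a flag is cleared
when the site is matched against a character is an assumption made
for the illustration. The permanent argument stands in for the
permanent-insertion variant, in which re-use of a flagged gap is
taken as confirmation of an insertion.

```python
from dataclasses import dataclass
from typing import Optional

# Flags are kept for the immediately preceding alignment and the one before it.
MAX_FLAG_LEVELS = 2


@dataclass
class Site:
    flag_age: Optional[int] = None     # None: no flag; 1: flagged one alignment ago
    permanent_insertion: bool = False  # set once an insertion is confirmed


def next_state(site: Site, matched: bool, permanent: bool = False) -> Site:
    """Advance one flagged ancestral site through the next alignment (toy model)."""
    if site.permanent_insertion or site.flag_age is None:
        return site                    # nothing to update
    if matched:
        return Site()                  # assumption: a match clears the flag
    if permanent:
        return Site(permanent_insertion=True)  # gap re-used: insertion confirmed
    age = site.flag_age + 1
    # Keep the flag for at most MAX_FLAG_LEVELS alignments, then drop it.
    return Site(flag_age=age if age <= MAX_FLAG_LEVELS else None)


fresh = Site(flag_age=1)
after_match = next_state(fresh, matched=True)
after_skip = next_state(fresh, matched=False)
after_second_skip = next_state(after_skip, matched=False)
confirmed = next_state(fresh, matched=False, permanent=True)
```

In this toy model, a flag that survives one alignment is carried for
one more level before it is dropped, which mirrors the "one before
that" rule in the text.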
4 Limitations of the Phylogeny-Aware Algorithm
Unlike typical progressive alignment algorithms, the phylogeny-
aware algorithm does not align sub-alignments to each other.
Instead, it reconstructs ancestral sequences to represent the
parents of sets of aligned descendant sequences and then aligns
these ancestral sequences pairwise. Accurate representation of the
ancestral sequences, including the detection of inserted and
deleted sites, is required for the correct distinction between
insertion and deletion events in the subsequent stages of
alignment. Correct reconstruction of sequences naturally requires
that such ancestral sequences really existed and were true
ancestors of the given sets of descendant sequences.
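The order of operations described above can be sketched as a
post-order walk over the alignment phylogeny. The sketch assumes a
binary tree written as nested tuples and takes the pairwise step as
a caller-supplied function; progressive_order and record_pair are
hypothetical names, and record_pair is only a placeholder that
records which pairs are aligned, not an alignment method.

```python
from typing import Callable, Dict, List, Tuple, Union

# A tree is a leaf name or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def progressive_order(
    tree: Tree,
    sequences: Dict[str, str],
    align_pair: Callable[[str, str], str],
) -> str:
    """Return the sequence representing `tree`, building ancestors bottom-up."""
    if isinstance(tree, str):
        return sequences[tree]          # leaf: an extant sequence
    left, right = tree
    left_anc = progressive_order(left, sequences, align_pair)
    right_anc = progressive_order(right, sequences, align_pair)
    # Only the two child representatives are aligned, never whole sub-alignments.
    return align_pair(left_anc, right_anc)


calls: List[Tuple[str, str]] = []


def record_pair(a: str, b: str) -> str:
    """Placeholder for pairwise alignment plus ancestor reconstruction."""
    calls.append((a, b))
    return f"anc({a},{b})"


# Hypothetical four-taxon example; each "sequence" is just its own name.
sequences = {name: name for name in "ABCD"}
root = progressive_order((("A", "B"), ("C", "D")), sequences, record_pair)
```

Because every internal node is built from its two children, changing
the tree changes which pairs are aligned and which ancestors are
reconstructed.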
As the ancestral sequences are reconstructed for the internal
nodes of the alignment phylogeny, it is crucial that the phylogeny
accurately reflects the evolutionary history of the sequences. The
role of the alignment phylogeny is especially central in the calling
of permanent insertions (PRANK +F), which treats the re-use of a
flagged gap as confirmation that the gap has been created by an
insertion. With the wrong order of aligning the sequences,