This chapter focuses on evolutionary sequence alignment and
the use of a phylogeny-aware alignment algorithm to infer
alignments for evolutionary studies. The definition of evolutionary
homology is central, so we will start by discussing it and its
correct representation in multiple sequence alignments. We will
then introduce, with many figures and no equations, the
phylogeny-aware alignment algorithm implemented in PRANK.
After detailing the strengths and weaknesses of the method, we
will see what these mean in practice and give advice on the use of
PRANK. We will finish with a brief discussion of the future plans for
PRANK and related methods.
In the following sections, some methods based on the classical
progressive algorithm are criticized and shown to perform poorly.
This criticism concerns their performance in evolutionary
analyses only; as demonstrated in other chapters of this book, the
alignments they produce can be suitable for other types of analyses.
Similarly, the phylogeny-aware algorithm may perform poorly in
non-evolutionary alignment tasks, for which alternative methods
should be used.
2 Evolutionary Homology in Sequence Alignment
A multiple sequence alignment represents sitewise homology
among the characters in different sequences. The type of homology
denoted by the alignment depends on the application and the
intended use of the data: when the alignment is used for evolutionary
analyses, the characters placed in the same column are believed
to be evolutionarily homologous, that is, to share a common ancestor.
Evolutionary homology is not the same as structural homology, and
the difference between the two homology types is most clearly
evidenced by the role of insertions. Two independent insertions
at the same position can lead to identical changes in the structure,
and the characters inserted independently may thus be considered
structurally homologous. In contrast, independent insertions,
even at exactly the same position, do not share a common ancestor
and can never be evolutionarily homologous. To correctly indicate
the evolutionary homology of insertions, the characters descending
from different insertion events should be placed in separate
alignment columns (Fig. 1).
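The column placement described above can be illustrated with a small sketch. It is not PRANK's algorithm, which has to infer the events from the data; here the insertion history is assumed to be known. The function name `align_by_history`, the toy ancestor `ACGT`, and the sequence names `seqA` and `seqB` are all hypothetical and chosen only for illustration. The sketch gives every independent insertion event its own columns, even when two events occur at the same position.

```python
from typing import Dict, List, Tuple


def align_by_history(ancestor: str,
                     insertions: Dict[str, Tuple[int, str]]) -> Dict[str, str]:
    """Build alignment rows in which each insertion event owns its columns.

    ancestor   -- hypothetical ancestral sequence (no deletions are modeled)
    insertions -- sequence name -> (position, inserted text); the text is
                  inserted before ancestor[position] in that sequence only
    """
    names = ["ancestor"] + sorted(insertions)
    rows: Dict[str, List[str]] = {name: [] for name in names}

    for pos in range(len(ancestor) + 1):
        # One block of columns per insertion event at this position:
        # only the sequence that carries the insertion has characters,
        # every other row receives gaps.
        for owner in sorted(insertions):
            ins_pos, text = insertions[owner]
            if ins_pos != pos:
                continue
            for name in names:
                rows[name].append(text if name == owner else "-" * len(text))
        # The ancestral character is inherited by every sequence,
        # so it forms a single shared column.
        if pos < len(ancestor):
            for name in names:
                rows[name].append(ancestor[pos])

    return {name: "".join(parts) for name, parts in rows.items()}


# Two independent insertions at the same position (assumed toy data).
alignment = align_by_history("ACGT", {"seqA": (2, "TT"), "seqB": (2, "GG")})
for name, row in alignment.items():
    print(f"{name:<10}{row}")
```

In this toy case the rows for `seqA` and `seqB` come out as `ACTT--GT` and `AC--GGGT`: the inserted characters never share a column, although both insertions occurred at the same position. An alignment that stacked `TT` over `GG` would be shorter but would claim a common ancestry that the assumed history does not support.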
If one restricts the analysis to relatively short sequence
fragments, one can assume that the sequences evolve by substitutions,
insertions, and deletions only. The three processes can be assumed
to occur at relatively constant rates (although the rate is distinct
for each process), substitutions typically being at least an order of
magnitude more common than insertions and deletions [2]. The
three processes differ greatly in their effect on the sequences and on
one's ability to infer the events from the data: (1) a character at a