d. How many distinct (unrooted) binary phylogenetic X-trees are there in $\mathcal{T}_n$, if $X = [n]$, for $n = 3, 4, 5, 6$?
10.3 TREE SPACE
10.3.1 Trees as Points
In the last section, we discussed trees, phylogenetic X-trees (and more so, binary ones)
in particular, as graphs, and developed a bit of intuition supporting their interpretation
in some biological situations. In this section, we move beyond the visual appeal of
trees, as graphical models, in order to interpret them in a geometric framework that
better sets the stage for implementing algorithmic and computational methods to
reconstruct the phylogeny of a set of species (or genes, or other organisms) and for
describing geometrically the advantages and pitfalls of those methods.
By definition, phylogenetic X-trees have leaves that are labeled, but for phylogenetic
tree reconstruction, one may also want to label the edges. Thinking, for the moment, of
the vertices $V$ of a phylogenetic X-tree $T = (V, E)$ as species, labels on the edges
can, depending on the biological context, be regarded as providing some attribute of
the sequences, often a measure of evolutionary change from one species to the next. In
graph-theoretic terms, a labeling of the edges $E$ of $T$ is an edge-weighting, a
function $\omega : E \to \mathbb{R}$ that assigns a real number $\omega(e)$ to every
edge $e \in E$. In many cases, such edge-weightings are assumed to take nonnegative
values, but for phylogenetic tree reconstruction algorithms, it is useful to allow for
more general edge-weightings. In phylogenetics, the graphical notion of an
edge-weighting corresponds to an evolutionary distance map. Determination of
evolutionary distances is a process anchored in the choice of a so-called model of
evolution, which describes how (and how frequently or with what probabilities)
sequences change, e.g., via substitutions, insertions, or deletions in their character
strings. This is a very large and significant area of research for which there are a
number of nice treatments in both biology and biomathematics texts; see the references
in the Introduction. We do not discuss in any detail here the many ways by which
evolutionary distances are defined and determined, although we do give two examples in
Exercise 10.8 below.
Trees $T \in \mathcal{T}_n$, together with weightings $\omega : E \to \mathbb{R}$, will
be the outcomes of the distance-based reconstruction methods we examine. We let

$$\mathcal{T}_n^{\omega} = \{ (T, \omega) \mid T = (V, E) \in \mathcal{T}_n,\ \omega : E \to \mathbb{R}_+ \}$$

denote the set of all ordered pairs of unrooted binary phylogenetic X-trees $T$ together
with positive edge weightings $\omega$ on $T$. Some distance-based reconstruction
methods may produce edge weightings with negative values or zero, so it may be useful
to extend $\mathcal{T}_n^{\omega}$ to include these weightings.
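To make the definition concrete, the following is a minimal Python sketch of how one might store an edge-weighted tree and test whether it is an unrooted binary phylogenetic X-tree with positive edge weights. The representation (a dictionary from vertex pairs to weights), the function names, the internal vertex names "u" and "v", and the example weights are all illustrative assumptions, not notation from the text. The sketch also assumes each edge is listed exactly once.

```python
from collections import defaultdict


def adjacency(weights):
    """Build neighbor sets from a dict {(u, v): weight} of undirected edges."""
    adj = defaultdict(set)
    for u, v in weights:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def is_connected(adj):
    """Depth-first search from an arbitrary vertex; True if it reaches all."""
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adj)


def is_positive_binary_x_tree(weights, labels):
    """Check that (T, omega) is an unrooted binary phylogenetic X-tree
    with strictly positive edge weights, where X = labels."""
    if not weights:
        return False
    adj = adjacency(weights)
    # A connected graph with |V| - 1 edges is a tree.
    is_tree = len(weights) == len(adj) - 1 and is_connected(adj)
    # Leaves are the degree-1 vertices and must be exactly the label set X.
    leaves = {v for v, nbrs in adj.items() if len(nbrs) == 1}
    # Binary: every non-leaf vertex has degree 3.
    binary = all(len(nbrs) == 3 for v, nbrs in adj.items() if v not in leaves)
    positive = all(w > 0 for w in weights.values())
    return is_tree and leaves == set(labels) and binary and positive


# Hypothetical example: a tree on X = {1, 2, 3, 4} with internal vertices u, v.
quartet = {
    (1, "u"): 0.3,
    (2, "u"): 0.1,
    ("u", "v"): 0.5,
    (3, "v"): 0.2,
    (4, "v"): 0.4,
}
print(is_positive_binary_x_tree(quartet, [1, 2, 3, 4]))
```

Dropping the `positive` condition gives one way to model the extended set that also admits zero or negative weights, as mentioned above.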
Exercise 10.8.
1. The Hamming distance, $d_H(x, y)$, is defined to be the number of characters at
which two sequences x and y differ. Suppose we have two sequences x, y over