d. How many distinct (unrooted) binary phylogenetic $X$-trees are there in $\mathcal{T}_n$, if $X = [n]$, for $n = 3, 4, 5, 6$?
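One way to check hand counts for this exercise is the standard closed form $(2n-5)!!$ for the number of unrooted binary phylogenetic trees on $n$ labeled leaves. The formula is an assumption brought in from outside this section: it follows because a tree on $k$ leaves has $2k-3$ edges, and leaf $k+1$ can be attached to any one of them. The function name below is hypothetical, and the block is only a minimal sketch.

```python
def count_unrooted_binary_trees(n: int) -> int:
    """Return (2n-5)!!, the assumed count of unrooted binary
    phylogenetic trees on n labeled leaves (n >= 3)."""
    if n < 3:
        raise ValueError("n must be at least 3")
    count = 1
    # Multiply the odd factors 3, 5, ..., 2n-5; the product is empty for n = 3.
    for k in range(3, 2 * n - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(n, count_unrooted_binary_trees(n))
```

Comparing the printed values with an explicit enumeration for small $n$ is a useful check on both the formula and the hand count.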
10.3 TREE SPACE
10.3.1 Trees as Points
In the last section, we discussed trees, phylogenetic $X$-trees (and more so, binary ones)
in particular, as graphs, and developed a bit of intuition supporting their interpretation
in some biological situations. In this section, we move beyond the visual appeal of
trees, as graphical models, in order to interpret them in a geometric framework that
better sets the stage for implementing algorithmic and computational methods to
reconstruct the phylogeny of a set of species (or genes, or other organisms) and for
describing geometrically the advantages and pitfalls of those methods.
By definition, phylogenetic $X$-trees have leaves that are labeled, but for phylogenetic tree reconstruction, one may also want to label the edges. Thinking of the vertices $V$ of a phylogenetic $X$-tree $T = (V, E)$ as species for the moment, labels on the edges can, depending on the biological context, be regarded as providing some attribute of the sequences, often a measure of evolutionary change from one species to the next. In graph-theoretic terms, a labeling of the edges $E$ of $T$ is an edge-weighting, a function $\omega : E \to \mathbb{R}$ that assigns a real number, $\omega(e)$ say, to every edge $e \in E$. In many cases, such edge-weightings are assumed to take nonnegative values, but for phylogenetic tree reconstruction algorithms, it is useful to allow for more general edge-weightings. In phylogenetics, the graphical notion of an edge-weighting corresponds to an evolutionary distance map. Determination of evolutionary distances is a process anchored in the choice of a so-called model of evolution, which describes how (and how frequently or with what probabilities) sequences change, e.g., via substitutions, insertions, or deletions in their character strings. This is a very large and significant area of research for which there are a number of nice treatments in both biology and biomathematics texts; see the references in the Introduction. Although we do not discuss in any detail here the many ways by which evolutionary distances are defined and determined, we do give two examples in Exercise 10.8 below.
Trees $T \in \mathcal{T}_n$, together with weightings $\omega : E \to \mathbb{R}$, will be the outcomes of the distance-based reconstruction methods we examine. We let $\mathcal{T}_n^{\omega} = \{(T, \omega) \mid T = (V, E) \in \mathcal{T}_n,\ \omega : E \to \mathbb{R}^{+}\}$ denote the set of all ordered pairs of unrooted binary phylogenetic $X$-trees $T$ together with positive edge weightings on $T$. Some distance-based reconstruction methods may produce edge weightings with negative values or zero, so it may be useful to extend $\mathcal{T}_n^{\omega}$ to include these weightings.
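To make the pair $(T, \omega)$ concrete, the following minimal sketch stores a tree as a list of edges and an edge-weighting as a dictionary keyed by those edges, then tests whether the pair would belong to $\mathcal{T}_n^{\omega}$. The representation, the function names, and the example weights are all assumptions made for illustration, not the text's own construction. The structural test only checks vertex degrees (1 for leaves, 3 for internal vertices) and does not verify that the graph is connected and acyclic.

```python
from collections import defaultdict


def has_binary_degrees(edges):
    """Degree check only: every vertex has degree 1 (leaf) or 3 (internal).
    This does NOT verify that the graph is connected and acyclic."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return all(d in (1, 3) for d in degree.values())


def is_positively_weighted(edges, omega):
    """True if omega assigns a strictly positive weight to exactly the given
    edges and the edge list passes the binary degree check."""
    if set(omega) != set(edges):
        return False
    return has_binary_degrees(edges) and all(w > 0 for w in omega.values())


# Hypothetical example: leaves 1..4, internal vertices "u" and "v".
edges = [("1", "u"), ("2", "u"), ("u", "v"), ("3", "v"), ("4", "v")]
omega = {("1", "u"): 0.3, ("2", "u"): 0.1, ("u", "v"): 0.5,
         ("3", "v"): 0.2, ("4", "v"): 0.4}

# A weighting with a zero internal edge, as some methods may produce.
omega_zero = dict(omega)
omega_zero[("u", "v")] = 0.0

if __name__ == "__main__":
    print(is_positively_weighted(edges, omega))       # strictly positive weights
    print(is_positively_weighted(edges, omega_zero))  # contains a zero weight
```

Relaxing the test `w > 0` to `w >= 0`, or dropping it altogether, corresponds to extending the set to include the zero or negative weightings mentioned above.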
Exercise 10.8.
1. The Hamming distance, $d_H(x, y)$, is defined to be the number of characters at which two sequences $x$ and $y$ differ. Suppose we have two sequences $x$, $y$ over