Biology Reference
In-Depth Information
2. If T is a phylogenetic X-tree, for any set X with |X| ≥ 4, and with any positive edge-weighting ω and induced dissimilarity map D_T, show that the four-point condition holds for T.
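As a numerical sanity check (not a proof), the four-point condition can be tested directly on a small example. The sketch below assumes the standard formulation, d(w, x) + d(y, z) ≤ max{d(w, y) + d(x, z), d(w, z) + d(x, y)} for all w, x, y, z in X (not necessarily distinct). The function name four_point_holds, the quartet distances, and the tolerance are illustrative assumptions, not taken from the text.

```python
from itertools import product


def four_point_holds(labels, d, tol=1e-9):
    """Check d(w,x) + d(y,z) <= max(d(w,y) + d(x,z), d(w,z) + d(x,y)) for all quadruples."""

    def dist(a, b):
        # Distances are stored once per unordered pair; the diagonal is zero.
        return 0.0 if a == b else d[frozenset((a, b))]

    for w, x, y, z in product(labels, repeat=4):
        lhs = dist(w, x) + dist(y, z)
        rhs = max(dist(w, y) + dist(x, z), dist(w, z) + dist(x, y))
        if lhs > rhs + tol:
            return False
    return True


# Hypothetical path-length distances on a quartet tree with cherries {x, w} and {y, z},
# using assumed edge weights x-u: 1, w-u: 2, u-v: 3, y-v: 4, z-v: 5.
labels = ["x", "w", "y", "z"]
tree_like = {
    frozenset(("x", "w")): 3.0,
    frozenset(("y", "z")): 9.0,
    frozenset(("x", "y")): 8.0,
    frozenset(("x", "z")): 9.0,
    frozenset(("w", "y")): 9.0,
    frozenset(("w", "z")): 10.0,
}

# Same values except d(w, z), inflated so that the condition is violated.
not_tree_like = dict(tree_like)
not_tree_like[frozenset(("w", "z"))] = 20.0

print(four_point_holds(labels, tree_like))
print(four_point_holds(labels, not_tree_like))
```

A passing check on one example only illustrates the claim; the exercise asks for an argument that covers every positively weighted phylogenetic X-tree.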
Returning to an arbitrary labeled X-tree T = (V, E), X ⊆ V, with edge-weighting ω : E → R, we have, by assumption, that there's no edge between a leaf and itself, so d_T(x, x) = 0 for any x ∈ X. This is consistent with notions of evolutionary distance: there's zero dissimilarity between any sequence s_x and itself. It is also presumed that the information encoded by evolutionary distance measures on pairs of sequences s_x, s_y is independent of the ordering of the two (i.e., it does not matter which is “first” or “second”). From Exercise 10.8, one can see explicitly that the evolutionary distances d_J and d_H are dissimilarity maps.
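The two properties just described (zero distance from a sequence to itself, and independence of ordering) can be checked mechanically. The following is a minimal sketch, assuming d_H is the raw count of mismatched sites between equal-length sequences (the text's d_H may be normalized differently). The helper names hamming and is_dissimilarity_map and the toy sequences are hypothetical.

```python
def hamming(s, t):
    """Number of positions at which two equal-length sequences differ."""
    if len(s) != len(t):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(a != b for a, b in zip(s, t))


def is_dissimilarity_map(labels, d):
    """True if d(x, x) == 0 and d(x, y) == d(y, x) for all labels x, y."""
    zero_diagonal = all(d(x, x) == 0 for x in labels)
    symmetric = all(d(x, y) == d(y, x) for x in labels for y in labels)
    return zero_diagonal and symmetric


# Toy sequences, invented for illustration only.
seqs = {"a": "ACGT", "b": "ACGA", "c": "TCGA"}


def d_H(x, y):
    # Distance between two labels is the distance between their sequences.
    return hamming(seqs[x], seqs[y])


print(is_dissimilarity_map(list(seqs), d_H))
```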
We have seen that if one begins with an edge-weighted X-tree T, using the tree and the weighting, one can use the edge labels to find a value d_T(x, y) for each pair of leaves x, y ∈ X that has an interpretation as a “distance,” namely, the path length between x and y along the tree T, if the “length” of an edge e is taken to be its edge weight ω(e). The resulting matrix of all values D_T = [d_T(x, y)]_{x,y ∈ X} is a dissimilarity map. Again, the fundamental biological problem of phylogenetics is to start the other way around: from a relevant set X (species, genes, etc., or sequences standing in for the species, genes, and so on) and a collection {d(x, y)}_{x,y ∈ X} of relevant “distances,” find a tree T and an edge-weighting ω for which the natural dissimilarity map D_T fits the data D = [d(x, y)]_{x,y ∈ X} “well.” This is what is meant by “reconstructing a phylogenetic tree” from the given data, using a distance-based approach. In the most ideal case of “fitting well,” D_T fits D exactly, that is, they agree as functions: d_T(x, y) = d(x, y) for all x, y ∈ X. If one begins with a phylogenetic X-tree T and a nonnegative edge-weighting ω for T, and takes as data D by setting D = D_T, then trivially D_T fits D exactly. This raises the issue of consistency in tree reconstruction methods, that is, whether a tree reconstruction method applied to data D that is derived from a tree T (perhaps weighted, with weight ω) actually outputs this tree T (and the corresponding weights ω). One can also speak more generally of statistical consistency of a tree reconstruction method, namely, whether the method outputs the correct tree with high probability given sufficient input data.
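The passage from an edge-weighted tree to its dissimilarity map D_T, and the notion of an exact fit, can be sketched in a few lines. This is an illustration under stated assumptions: the tree is given as a list of weighted edges, the quartet shape and its weights are invented, and the names tree_distances and fits_exactly are hypothetical.

```python
from collections import defaultdict


def tree_distances(weighted_edges, leaves):
    """Return D_T as nested dicts: D_T[x][y] is the path length between leaves x and y."""
    adj = defaultdict(list)
    for a, b, w in weighted_edges:
        adj[a].append((b, w))
        adj[b].append((a, w))

    def from_source(src):
        # Paths in a tree are unique, so a single traversal yields path lengths.
        dist = {src: 0.0}
        stack = [src]
        while stack:
            node = stack.pop()
            for nxt, w in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + w
                    stack.append(nxt)
        return dist

    result = {}
    for x in leaves:
        dist = from_source(x)
        result[x] = {y: dist[y] for y in leaves}
    return result


def fits_exactly(D_T, D, labels, tol=1e-9):
    """True if d_T(x, y) equals d(x, y), up to tol, for all x, y in labels."""
    return all(abs(D_T[x][y] - D[x][y]) <= tol for x in labels for y in labels)


# Assumed quartet tree: cherries {x, w} at u and {y, z} at v, with invented weights.
edges = [("x", "u", 1.0), ("w", "u", 2.0), ("u", "v", 3.0),
         ("y", "v", 4.0), ("z", "v", 5.0)]
leaves = ["x", "w", "y", "z"]

D_T = tree_distances(edges, leaves)

# Taking the data to be D = D_T gives an exact fit by construction.
D = {x: dict(row) for x, row in D_T.items()}
print(fits_exactly(D_T, D, leaves))
```

The reconstruction problem runs in the opposite direction: given only D, find a tree and weights whose D_T passes a check like fits_exactly, or comes close to it.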
Exercise 10.14.
1. For the quartet tree T with cherry {x, w} having parent node u and cherry {y, z} having parent node v, and for the representative sequences on the leaves X = {x, y, w, z} below, if D = d_H, can you find an edge-weighting ω for T so that D_T = D?
s_w = GATTTCCTTC,
s_x = GACATACTTC,
s_y = GATTACATTC,
s_z = GATTAAACTTC.
2. In the basic parsimony method of tree reconstruction, cherries are selected by linking nodes for which d_H is minimal. For the same quartet tree T with
 