2. If $T$ is a phylogenetic $X$-tree, for any set $X$ with $|X| \geq 4$, and with any positive edge-weighting $\omega$ and induced dissimilarity map $D_{T,\omega}$, show that the four-point condition holds for $T$.
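Before attempting a proof, it can help to test the four-point condition numerically on a small example. The following is a minimal sketch, not a proof: it assumes the usual statement of the condition, $d(w,x) + d(y,z) \leq \max\{d(w,y) + d(x,z),\ d(w,z) + d(x,y)\}$ for all $w, x, y, z \in X$, and the function name `four_point_holds`, the dictionary layout, and all numerical values are hypothetical choices made for illustration.

```python
from itertools import product


def four_point_holds(d, labels, tol=1e-9):
    """Check the four-point condition for a dissimilarity map.

    d maps frozenset({a, b}) to the dissimilarity of two distinct labels;
    the dissimilarity of a label with itself is taken to be 0.
    """
    def dist(a, b):
        return 0.0 if a == b else d[frozenset((a, b))]

    # Test the inequality on every 4-tuple of labels, repeats included.
    for w, x, y, z in product(labels, repeat=4):
        lhs = dist(w, x) + dist(y, z)
        rhs = max(dist(w, y) + dist(x, z), dist(w, z) + dist(x, y))
        if lhs > rhs + tol:
            return False
    return True


labels = ["w", "x", "y", "z"]

# Path lengths on a hypothetical quartet tree with cherries {x, w} and {y, z}:
# pendant weights x:1, w:2, y:4, z:5 and interior edge weight 3.
tree_d = {
    frozenset(("x", "w")): 3, frozenset(("y", "z")): 9,
    frozenset(("x", "y")): 8, frozenset(("w", "z")): 10,
    frozenset(("x", "z")): 9, frozenset(("w", "y")): 9,
}

# A hypothetical dissimilarity map that does not come from any weighted tree.
bad_d = {
    frozenset(("x", "w")): 1, frozenset(("y", "z")): 1,
    frozenset(("x", "y")): 1, frozenset(("w", "z")): 2,
    frozenset(("x", "z")): 2, frozenset(("w", "y")): 3,
}

print(four_point_holds(tree_d, labels))
print(four_point_holds(bad_d, labels))
```

For the tree-derived values, the three pairings of the four leaves give the sums 12, 18, and 18, so the two largest agree; for the second map the sums are 2, 3, and 5, and the inequality fails.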
Returning to an arbitrary labeled $X$-tree $T = (V, E)$, $X \subset V$, with edge-weighting $\omega : E \to \mathbb{R}$, we have, by assumption, that there's no edge between a leaf and itself, so $d_{T,\omega}(x, x) = 0$ for any $x \in X$. This is consistent with notions of evolutionary distance: there's zero dissimilarity between any sequence $s_x$ and itself. It is also presumed that the information encoded by evolutionary distance measures on pairs of sequences $s_x$, $s_y$ is independent of the ordering of the two (i.e., it does not matter which is “first” or “second”). From Exercise 10.8, one can see explicitly that the evolutionary distances $d_J$ and $d_H$ are dissimilarity maps.
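The two properties just described (zero dissimilarity between a sequence and itself, and independence of ordering) can be checked directly on small data. The sketch below assumes that $d_H$ denotes the Hamming distance, that is, the number of mismatched sites between two aligned sequences of equal length; the helper names and the toy sequences are hypothetical and are not taken from the text.

```python
def hamming(s, t):
    """Number of positions at which two equal-length sequences differ."""
    if len(s) != len(t):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(a != b for a, b in zip(s, t))


def is_dissimilarity_map(d, items):
    """Check d(x, x) == 0 and d(x, y) == d(y, x) for all x, y in items."""
    zero_on_diagonal = all(d(x, x) == 0 for x in items)
    symmetric = all(d(x, y) == d(y, x) for x in items for y in items)
    return zero_on_diagonal and symmetric


# Toy aligned sequences, made up for illustration only.
toy_sequences = ["GATTACA", "GACTACA", "GATTTCA"]

print(hamming("GATTACA", "GACTACA"))
print(is_dissimilarity_map(hamming, toy_sequences))
```

A check like this on a handful of sequences only illustrates the definition; the general claim is what Exercise 10.8 asks one to verify.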
We have seen that if one begins with an edge-weighted $X$-tree $T$, using the tree and the weighting, one can use the edge labels to find a value $d_{T,\omega}(x, y)$ for each pair of leaves $x, y \in X$, that has an interpretation as a “distance,” namely, the path length between $x$ and $y$ along the tree $T$, if the “length” of an edge $e$ is taken to be its edge weight $\omega(e)$. The resulting matrix of all values $D_{T,\omega} = [d_{T,\omega}(x, y)]_{x, y \in X}$ is a dissimilarity map.

Again, the fundamental biological problem of phylogenetics is to start the other way around: from a relevant set $X$ (species, genes, etc. or sequences standing in for the species, genes, and so on) and a collection $\{d(x, y)\}_{x, y \in X}$ of relevant “distances,” find a tree $T$ and an edge-weighting $\omega$ for which the natural dissimilarity map $D_{T,\omega}$ fits the data $D = [d(x, y)]_{x, y \in X}$ “well.” This is what is meant by “reconstructing a phylogenetic tree” from the given data, using a distance-based approach. In the most ideal case of “fitting well,” $D_{T,\omega}$ fits $D$ exactly, that is, they agree as functions: $d_{T,\omega}(x, y) = d(x, y)$ for all $x, y \in X$.

If one begins with a phylogenetic $X$-tree $T$, and a nonnegative edge-weighting $\omega$ for $T$, and takes as data $D$ by setting $D = D_{T,\omega}$, then trivially $D_{T,\omega}$ fits $D$ exactly. This raises the issue of consistency in tree reconstruction methods, that is, whether a tree reconstruction method applied to data $D$ that is derived from a tree $T$ (perhaps weighted, with weight $\omega$) actually outputs this tree $T$ (and the corresponding weights $\omega$). One can also speak more generally of statistical consistency of a tree reconstruction method, namely, the probability that the method outputs the correct tree given sufficient data about the input.
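The passage from an edge-weighted tree to its dissimilarity map, and the notion of an exact fit, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reconstruction method: the function names `tree_metric` and `fits_exactly`, the edge-dictionary layout, and the edge weights on the quartet tree are all hypothetical.

```python
def tree_metric(edges, leaves):
    """Path-length dissimilarity map among the leaves of an edge-weighted tree.

    edges maps a pair of adjacent nodes (a, b) to the weight of that edge.
    Returns a nested dict D with D[x][y] the path length from x to y.
    """
    adjacency = {}
    for (a, b), weight in edges.items():
        adjacency.setdefault(a, []).append((b, weight))
        adjacency.setdefault(b, []).append((a, weight))

    def distances_from(source):
        # In a tree there is exactly one path to each node, so a plain
        # traversal that accumulates edge weights gives the path lengths.
        dist = {source: 0.0}
        stack = [source]
        while stack:
            node = stack.pop()
            for neighbor, weight in adjacency[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + weight
                    stack.append(neighbor)
        return dist

    D = {}
    for x in leaves:
        from_x = distances_from(x)
        D[x] = {y: from_x[y] for y in leaves}
    return D


def fits_exactly(D_tree, D_data, leaves, tol=1e-9):
    """True if the two dissimilarity maps agree on every pair of leaves."""
    return all(abs(D_tree[x][y] - D_data[x][y]) <= tol
               for x in leaves for y in leaves)


# Hypothetical quartet tree: cherry {x, w} under u, cherry {y, z} under v.
edges = {("x", "u"): 1, ("w", "u"): 2, ("u", "v"): 3,
         ("y", "v"): 4, ("z", "v"): 5}
leaves = ["w", "x", "y", "z"]

D = tree_metric(edges, leaves)
print(D["x"]["z"])
print(fits_exactly(D, D, leaves))
```

Taking the data to be the map computed from the tree itself reproduces the trivial exact fit described above; a consistency question would instead ask whether a reconstruction method, given only `D`, recovers `edges`.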
Exercise 10.14.
1. For the quartet tree $T$ with cherry $\{x, w\}$ having parent node $u$ and cherry $\{y, z\}$ having parent node $v$, and for the representative sequences on the leaves $X = \{x, y, w, z\}$ below, if $D = d_H$, can you find an edge-weighting $\omega$ for $T$ so that $D_{T,\omega} = D$?
$s_w$ = GATTTCCTTC, $s_x$ = GACATACTTC, $s_y$ = GATTACATTC, $s_z$ = GATTAAACTTC.
2. In the basic parsimony method of tree reconstruction, cherries are selected by linking nodes for which $d_H$ is minimal. For the same quartet tree $T$ with