Biology Reference
In-Depth Information
R + 5 are subsets of -dimensional space, rather
than 3-dimensional space.) In more mathematical terms,
many points in time to consider, and the
T 4 lives nicely in a product
5 , and looks like three copies of (the positive orthants of)
5 .
T 4 × R
vector space
R
T n × R + s ,of
, T n is the product set
(
) !!
More generally, for n
3
2 n
5
many copies
s , where
of
| T n |= (
2 n
5
) !!
and s
=|
E
|=
2 n
3, both as per Exercise 10.4 .
R
T n is handled somewhat differently by differ-
ent authors, to take advantage of simplifications in representing trees (and to use
rooted trees in some cases, unrooted in others). For instance, if one extends to include
nonnegative edge-weightings (that is, some values
The treatment of the tree space
can be zero), then one may
include positively weighted, (unrooted) nonbinary trees on n leaves as part of tree
space, and then compare how edge-weighted binary n -trees are related to one another
through the collapsing of one or more edges. For an original definition of tree space
and helpful pictures of this sort in the rooted tree case, as well as a discussion of how
the positive orthants are “glued” together along spaces corresponding to (nonbinary)
trees with collapsed edges, see [ 18 ]; for another analogous but different perspective
on the unrooted case, also with nice pictures, see [ 19 ]. Using another variation on tree
space, there has been a lot of work done to examine distances, particularly geodesics,
in tree space; see [ 18 , 20 , 21 ]. Regardless of the precise formulation of tree space,
as the example and brief treatment above may geometrically suggest, it can be use-
ful to group points
ω(
e
)
by their phylogenetic trees T or even by their underlying
topologies. This is, again, the perspective that for each T
(
T
,ω)
T n there is the “same”
s attached to it, so to determine a point in
copy of
R
T n , it is often useful to focus on
determining just the tree T
T n alone, and then think of the associated weighting
ω
, even though the two are linked. (Again, this is analogous to thinking of a point in
our everyday four-dimensional space-time lives first by its time coordinate, then as a
point in space at that time.)
In any case, for practical purposes, however, the notion of standard distance in
m may not be the best distance measure for comparing trees and fitting trees to
dissimilarity maps. In fact, it is shown in [ 22 ] that taking a least-squares approach
to picking an edge-weighted tree T with weighting
R
ω
and induced metric D T =
[
d T ,ω( i , j ) ]
to best fit a given dissimilarity map D
=[
d
(
i
,
j
) ]
is an NP-complete
problem. The same holds true for minimizing
1 i < j n |
d
(
i
,
j
)
d T (
i
,
j
) |
(the
f-statistic ) rather than the corresponding sum of squares.
In the search for heuristics to carry out tree reconstruction, many fitting algo-
rithms proceed instead by wandering among the trees (points) in tree space, and
rather than use a least-squares or f -statistic approach, use metrics that record points
v T ,v T
T
as being “close” when their trees T
,
T n (edge-weighted by
ω
,
ω ) can be transformed into one another explicitly by means of a small num-
ber of iterations of a few well-known operations on trees as graphs. In the case of
Robinson-Foulds (RF) topological distance, the two relevant operations are a form of
merging nodes and then splitting nodes on the underlying tree topologies (ignoring
weightings), as illustrated in the picture below for two trees of RF-distance apart (see
Figure 10.5 ).
resp.,
 
Search WWH ::




Custom Search