Biology Reference
In-Depth Information
The evaluator scores a given tree with the method's optimality crite-
rion. Examples of evaluators are ||
|| 2 for WLS trees, L ( P ) for ML trees,
and the number of necessary changes for parsimony. We will not discuss
evaluators any further. In the following two subsections, we will discuss
the constructor part of the CHE and look at different kinds of heuristics
to search a tree space.
ε
3.1. Constructors
A constructor builds an initial tree from the input data and is therefore
itself a tree building method. It should be fast and provide a reasonable
start for the optimization part. The greedy clustering algorithms men-
tioned in the previous section have these properties. In principle, a com-
pletely random tree is also a constructor, but it is usually so poor that most
heuristics are unable to recover. A constructor having some randomness is
even a better idea because (a) better trees can be obtained by optimizing
from different starting points, and (b) different starting trees can be opti-
mized independently of each other and hence we can perform the time-
consuming optimization in parallel. Ideally, we should be able to control
the level of randomness of the constructor with a parameter, so that at one
end we have total randomness (poor trees, great diversity) and at the other
end we have deterministic trees (reasonably good ones, no diversity). One
possibility to achieve this goal is to randomize greedy algorithms.
We illustrate this tunable randomization on the example of UPGMA. We
will modify UPGMA so that at every joining step, instead of taking the near-
est neighbor, it makes a random choice. Let p be the randomness parameter,
where p
1 corresponds to determinism.
At each step, the choices are made with the following probabilities:
=
0 means total randomness and p
=
Pr {join nearest neighbors}
=
p
Pr {join 2nd nearest neighbors}
=
p (1
p )
p ) 2
Pr {join 3rd nearest neighbors}
=
p (1
...
if all fails, make a uniform random choice among all neighbors.
So, p
0 guarantees a purely random join at every step, which will pro-
duce a random tree; while p
=
=
1 corresponds to UPGMA. There is a clear
Search WWH ::




Custom Search