Biology Reference
In-Depth Information
for building trees is to run the kernel optimization over 50 trees gener-
ated by the randomized WPGMA, and then apply the RBFS heuristic
recursively over the best tree.
4.2. Time Comparison with PAUP
We have compared the running times of LST with the corresponding
function
hsearch
contained in PAUP* 4.0 beta 10 on simulated data.
4.2.1. Methods
The input problems consisted of pairwise distances from trees with
n
{500, 1000} leaves. The distances were obtained in the following
way: We generated a random tree with uniformly distributed U(0.1..
l
max
)
point accepted mutation (PAM) branch lengths. The maximum branch
length
l
max
is a parameter in the simulation that controls the difficulty of
the problems. A random sequence of 500 amino acids was generated and
mutated along the random tree. We assumed a Markovian model of evo-
lution using PAM matrices and introduced gaps of Zipfian distributed
length.
28
From the sequences at the leaves of the tree, we computed
pairwise distance estimates by ML.
The distances were then fed to LST and
hsearch
. We chose to use the
default parameters for
hsearch
and to impose a maximal running time of
50 hours (by setting the parameter time limit
∈
180 000). LST was run
in two modes. In the first mode, the parameter Trials was set to one,
which constitutes the lowest possible optimization level. In the second
mode, we set Trials
=
50 and used the RBFS heuristic.
The simulations were run on an AMD Athlon, 1800 MHz processor
running a Red Hat Linux 2.4.21. The times reported in the results are
user times measured with the time command included in Linux.
=
4.2.2. Results
Table 1 gives the results for (a) 500 leaves and (b) 1000 leaves. A row
corresponds to a particular
l
max
. It shows the average running time over
20 input problems, how often LST and
hsearch
returned the same tree,