Biology Reference
In-Depth Information
4.5.3. Heuristic methods in tree reconstruction
The branch-and-bound algorithm is able to increase the practical maxi-
mum number of sequences or taxons up to 20 (depending on the degree
of homoplasy in the data). Once this threshold is passed, heuristic meth-
ods become necessary, as was the case with the multiple-alignment
programs mentioned in Sec. 3.2. Otherwise, depending on the complexity
of the data (the degree of homoplasy, for instance), CMP, CC, and MLE
(maximum likelihood estimation) methods can be impossibly slow in
providing a globally optimal solution.
Heuristic methods do not search the entire search space for a global
optimum, but nevertheless attempt to find the globally best tree. They
must be able to search through parts of the solution space efficiently.
Generally, the heuristics consist in searching through the search space,
starting from random locations and continuing in a direction where bet-
ter results are found. Once a local optimum is reached, results become
worse in all directions.
The methods used range from simple hill-climbing optimization
approaches to the complex MCMC and (MC) 3 algorithms. Creating a
decision tree as for the branch-and-bound algorithm is no longer possi-
ble; therefore, tree rearrangement techniques are used to modify a given
tree in the process of optimizing it, including when a local optimum has
been found.
It must be stressed that the programs return the best result found,
but have no way of knowing whether they have reached the global opti-
mum or not. Thus, all heuristic approaches are somewhat sensitive to
local optima and no guarantee can be given that the best tree is indeed
found. This problem is common to all nonexhaustive combinatorial
search algorithms, whether they are applied to phylogenetics or not.
4.5.4. A rapid maximum likelihood method:RAxML
RAxML is one of the fastest probabilistic methods available to date.
It uses a combination of specially designed heuristics and simple program
optimization. More specifically, it entails 12
efficient storage of intermediate solutions (topologies and branch
lengths) using rearrangement operators;
Search WWH ::




Custom Search