Biology Reference
In-Depth Information
tree reconstruction, we normally refer to sequence-based meth-
ods. In this case, the data consist of a multiple sequence alignment
and the preferred substitution model is a Markovian one. PAML, 23
PHYML, 24 and RAxML 25 are well-known implementations of the
sequence-based ML method.
The same substitution model used for sequence-based ML is
often used on pairs of sequences to estimate pairwise distances used
by LS tree methods. During this conversion, a reduction of infor-
mation takes place. As a consequence, LS methods are expected to
be statistically less efficient than ML methods, which use the
sequence information directly. The advantage of the distance
approach is that it is substantially faster, allowing fast reconstruction
of very large trees.
The tradeoff between speed and topological accuracy is also an algo-
rithmic issue. Greedy methods are faster than optimality-based ones, but
they generally produce inferior trees. In the next section, we will see how
greedy methods can be used to produce starting trees, which are then
improved using heuristics and an optimality criterion.
3. Tree Building with an Objective Function
In this section, we treat the problem of finding the best tree according to
some optimality criterion. We have seen in Sec. 1 that an exhaustive search
of the tree space is not possible, even for a small number of leaves. A gen-
eral strategy to solve difficult problems is to use a constructor-heuristic-
evaluator (CHE) approach. A CHE procedure constructs an initial
solution with the constructor, and then applies heuristics to improve it; the
result is then evaluated to decide whether it is retained or not. The follow-
ing is a high-level description of the CHE approach in pseudocode:
sol: = constructor(problem)
repeat
new: = heuristic(sol, problem)
if evaluator(new) > evaluator(sol) then
sol: = new
endif
until finish-criteria
Search WWH ::




Custom Search