these trees is the best approximation of the “true” tree can be difficult when
rates of DNA substitution are high; multiple substitutions at a site can make it
difficult to resolve true relationships, producing the “wrong tree.” Methods that
explicitly deal with multiple substitutions can overcome the statistical problems,
but the most powerful methods (maximum likelihood) can be used only on relatively
small data sets, and many of the faster methods do not take advantage of
all the data contained in the DNA sequences.
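As a rough illustration of what correcting for multiple substitutions means, the sketch below applies the Jukes-Cantor (JC69) distance correction, d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites. JC69 is chosen here only because it is the simplest such model; the text does not name a specific model, and the function name is hypothetical.

```python
import math

def jukes_cantor_distance(p_observed: float) -> float:
    """Estimate substitutions per site from the observed proportion of
    differing sites, assuming the Jukes-Cantor (JC69) model."""
    if not 0.0 <= p_observed < 0.75:
        # Under JC69, unrelated sequences are expected to differ at 75% of sites,
        # so the correction is undefined at or beyond that point.
        raise ValueError("JC69 correction requires 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p_observed)

# The corrected distance always meets or exceeds the observed proportion,
# and the gap widens as sequences diverge (hidden multiple hits accumulate).
for p in (0.05, 0.25, 0.50):
    print(p, round(jukes_cantor_distance(p), 3))
```

The widening gap between observed and corrected values at high divergence is the effect the paragraph describes: uncorrected differences underestimate the true amount of change.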
Bayesian inference makes it possible to analyze large data sets more easily.
Instead of searching for the optimal tree, trees are sampled according to their
posterior probabilities. Once such a sample is available, features that are common
among these trees can be discerned and a consensus tree can be constructed.
“This is roughly equivalent to performing a maximum likelihood analysis with
bootstrap resampling, but much faster” (Huelsenbeck et al. 2001).
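The step from a sample of trees to a consensus can be sketched as follows. This is a minimal illustration, not the procedure of any particular program: each sampled tree is represented as a set of clades (an assumed, simplified encoding), and the toy sample is made up rather than taken from a real analysis.

```python
from collections import Counter

def clade_frequencies(sampled_trees):
    """Fraction of sampled trees that contain each clade.

    Each tree is a set of clades; each clade is a frozenset of taxon names.
    """
    counts = Counter()
    for tree in sampled_trees:
        counts.update(set(tree))  # count each clade at most once per tree
    n = len(sampled_trees)
    return {clade: c / n for clade, c in counts.items()}

def majority_rule_clades(sampled_trees, threshold=0.5):
    """Clades present in more than `threshold` of the sampled trees."""
    freqs = clade_frequencies(sampled_trees)
    return {clade: f for clade, f in freqs.items() if f > threshold}

# Toy sample of four rooted trees on taxa A-D (hypothetical values).
AB, CD, AC, ABC = (frozenset(s) for s in ("AB", "CD", "AC", "ABC"))
sample = [{AB, CD}, {AB, CD}, {AB, ABC}, {AC, ABC}]

support = majority_rule_clades(sample)
print(support)  # only the clade {A, B} exceeds 50%, with frequency 0.75
```

In a Bayesian analysis the clade frequencies in the sample approximate posterior probabilities of those clades, which is why they are often reported as support values on the consensus tree.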
Shoemaker et al. (1999) noted a “common criticism of the Bayesian approach
is that the choice of the prior distribution is too subjective.” Thus, researchers
using the same data could reach different conclusions if they used different
prior distributions. Furthermore, implementation of Bayesian methods can be
“very complex.” Nevertheless, Bayesian methods may be especially useful for
analyzing complex evolutionary models (including horizontal gene transfer) and
for accommodating phylogenetic uncertainty.
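The concern about prior choice can be made concrete with a small calculation. The sketch below applies Bayes' rule to three candidate trees; the likelihoods and both priors are invented purely for illustration, and the names `T1` to `T3` are hypothetical labels.

```python
def posterior(priors, likelihoods):
    """Bayes' rule over a discrete set of trees:
    posterior is proportional to prior times likelihood."""
    unnormalized = {t: priors[t] * likelihoods[t] for t in priors}
    total = sum(unnormalized.values())
    return {t: w / total for t, w in unnormalized.items()}

# Made-up likelihoods of the same data under three candidate trees.
likelihoods = {"T1": 0.020, "T2": 0.015, "T3": 0.005}

# Two researchers, same data, different priors (both hypothetical).
flat_prior = {"T1": 1 / 3, "T2": 1 / 3, "T3": 1 / 3}
skewed_prior = {"T1": 0.1, "T2": 0.8, "T3": 0.1}

post_flat = posterior(flat_prior, likelihoods)
post_skewed = posterior(skewed_prior, likelihoods)

best_flat = max(post_flat, key=post_flat.get)
best_skewed = max(post_skewed, key=post_skewed.get)
print(best_flat, best_skewed)  # the preferred tree differs between the two priors
```

With the flat prior the tree with the highest likelihood is preferred; with the strongly skewed prior a different tree wins, which is the situation the criticism describes.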
Most analyses are conducted using at least two methods (maximum likelihood
and Bayesian), and if the results obtained are similar, the tree estimates are
considered adequate.
12.6.8 Artifacts
Inaccuracies in trees may occur for a variety of reasons (Adoutte et al. 2000).
Alignments of corresponding sequences must be carried out carefully. If
unambiguous alignments of sequences cannot be obtained, different relationships
may be estimated. Poor alignments may result in a lack of strong statistical
support for a particular tree. Another factor that affects phylogenies is the species
chosen to represent each group. Use of different species within a group can
result in different trees. Increasing the number of species analyzed may resolve
this problem, but the increased number of species increases the computational
time required to find the best tree to represent the relationships. For example,
if five species are studied there are just 15 possible unrooted trees, but if 50
species are included, there are 3 × 10^74 potential trees to analyze.
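The tree counts quoted above follow from the standard formula for the number of unrooted, fully bifurcating trees on n labeled species, (2n - 5)!! = 1 × 3 × 5 × ... × (2n - 5). The short sketch below reproduces both figures; the function name is hypothetical.

```python
def count_unrooted_trees(n_taxa: int) -> int:
    """Number of distinct unrooted, fully bifurcating trees for n_taxa
    labeled tips, computed as the double factorial (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least three taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (empty product for n = 3).
    for odd in range(3, 2 * n_taxa - 4, 2):
        count *= odd
    return count

print(count_unrooted_trees(5))             # 15
print(f"{count_unrooted_trees(50):.2e}")   # about 2.8 × 10^74, i.e. roughly 3 × 10^74
```

Because each added species multiplies the count by the next odd number, exhaustive evaluation of every tree quickly becomes impossible, which is why larger analyses rely on heuristic searches or sampling.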
Other issues include determining whether sequences are orthologous,
paralogous, or xenologous. An ortholog is a homologous sequence produced by