Biology Reference
In-Depth Information
species, it is 10,395 trees. Maximum-likelihood methods provide consistent
estimates of branch lengths, meaning that the estimates approach the true values
as the amount of data increases. To assess how strongly the data support a
particular tree estimate, bootstrapping techniques can be used. Bootstrapping
involves repeated sampling, with replacement, from the original data to produce
artificial data sets, from which the variance of the estimate is calculated. The
name of this statistical method was derived from the expression “pull oneself up
by one's bootstraps,” and the method allows statistical distributions to be
generated from very little data.
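The resampling step can be illustrated with a minimal sketch. It assumes two short, made-up aligned sequences and uses the proportion of differing sites as a stand-in statistic; a real phylogenetic bootstrap would rebuild a tree from each artificial data set instead. The function names and the sequences are hypothetical and chosen only for illustration.

```python
import random
import statistics


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


def bootstrap_p_distance(seq_a, seq_b, n_replicates=1000, seed=0):
    """Resample alignment columns with replacement and recompute the statistic."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_sites = len(seq_a)
    estimates = []
    for _ in range(n_replicates):
        # Draw n_sites column indices with replacement: one artificial data set.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        rep_a = [seq_a[i] for i in cols]
        rep_b = [seq_b[i] for i in cols]
        estimates.append(p_distance(rep_a, rep_b))
    return estimates


# Hypothetical aligned sequences (20 sites, 3 differences).
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "ACGTACGAACGTTCGTACGA"

observed = p_distance(seq_a, seq_b)
estimates = bootstrap_p_distance(seq_a, seq_b)
boot_variance = statistics.variance(estimates)

print(f"observed p-distance: {observed:.3f}")
print(f"bootstrap variance:  {boot_variance:.5f}")
```

The spread of `estimates` approximates the sampling variability of the statistic even though only one alignment was ever observed.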
Methods for analyzing molecular data are still undergoing development. The
immense amount of DNA sequence data that is becoming available makes it dif-
ficult to use maximum-likelihood methods unless very powerful computers are
used. Maximum-likelihood algorithms have been developed to build trees from
pairwise distances, but they use only a summary of the data and information is
thus lost. Parsimony methods are fast, but may be appropriate only for very slow
rates of evolutionary change.
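The loss of information that comes with pairwise distances can be seen in a small sketch. It assumes a simple proportion-of-differences distance and two made-up alignments; the names `p_distance`, `distance_matrix`, `alignment_1`, and `alignment_2` are hypothetical. The two alignments differ at different sites and involve different nucleotides, yet they reduce to the same distance matrix.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b) / len(seq_a)


def distance_matrix(alignment):
    """Return {(name_i, name_j): p-distance} for every pair of sequences."""
    return {
        (a, b): p_distance(alignment[a], alignment[b])
        for a, b in combinations(sorted(alignment), 2)
    }


# Two hypothetical alignments with changes at different sites.
alignment_1 = {"sp1": "AAAAAAAA", "sp2": "AAAAAATT", "sp3": "AATTAAAA"}
alignment_2 = {"sp1": "AAAAAAAA", "sp2": "CCAAAAAA", "sp3": "AAAAAAGG"}

matrix_1 = distance_matrix(alignment_1)
matrix_2 = distance_matrix(alignment_2)

# Identical matrices: the site-by-site detail is no longer recoverable.
print(matrix_1 == matrix_2)
```

Any method that starts from the matrix alone cannot distinguish the two data sets, whereas a method that works on the sites themselves can.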
Another approach to analyzing evolutionary processes and phylogeny is
Bayesian inference (Shoemaker et al. 1999, Huelsenbeck et al. 2001, Ronquist
and Deans 2010, Fan and Kubatko 2011). Bayesian inference uses the same models
of evolution as other methods and can be used to infer phylogeny, evaluate
uncertainty in phylogenies, detect selection, compare trees, evaluate divergence
times, and test the molecular clock (Huelsenbeck et al. 2001). Bayesian inference
of phylogeny is based on a quantity called the “posterior probability of a tree”
and uses Bayes's theorem:
$$
\Pr[\text{Tree} \mid \text{Data}] = \frac{\Pr[\text{Data} \mid \text{Tree}] \times \Pr[\text{Tree}]}{\Pr[\text{Data}]}
$$
In this theorem the vertical bar should be read as “given.” The theorem is used to
“combine the prior probability of a phylogeny (Pr[Tree]) with the likelihood
(Pr[Data | Tree]) to produce a posterior-probability distribution on trees (Pr[Tree |
Data]). The posterior probability of a tree is the probability that the tree is correct.
Inferences about the history of the group are then based on the posterior prob-
ability of trees and the tree with the highest posterior probability might be chosen
as the best estimate of phylogeny” (Huelsenbeck et al. 2001). The likelihood is
calculated using one of a number of standard Markov models of character evolution.
A Markov process is a mathematical model of infrequent changes of discrete states
(nucleotides or amino acids) over time, in which future events occur by chance and
depend only on the current state, not on how that state was reached.
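As a rough illustration of the two ingredients described above, the sketch below shows (1) the transition probabilities of the Jukes-Cantor model, chosen here only as the simplest standard Markov model of nucleotide change, and (2) Bayes's theorem applied to the three possible rooted trees for three species. The log-likelihood values and the uniform prior are invented for the example, and the function names are hypothetical; in practice the likelihoods would come from the sequence data and the chosen model.

```python
import math


def jc69_transition_prob(same_state, branch_length):
    """Jukes-Cantor probability of ending in a given nucleotide.

    branch_length is the expected number of substitutions per site.
    The result depends only on the starting state and the elapsed
    branch length, which is the Markov property.
    """
    decay = math.exp(-4.0 * branch_length / 3.0)
    if same_state:
        return 0.25 + 0.75 * decay
    return 0.25 - 0.25 * decay


def posterior_over_trees(log_likelihoods, priors):
    """Pr[Tree | Data] = Pr[Data | Tree] * Pr[Tree] / Pr[Data]."""
    log_joint = {
        tree: log_likelihoods[tree] + math.log(priors[tree])
        for tree in log_likelihoods
    }
    # Subtract the maximum before exponentiating to avoid underflow.
    max_log = max(log_joint.values())
    weights = {tree: math.exp(v - max_log) for tree, v in log_joint.items()}
    total = sum(weights.values())  # proportional to Pr[Data]
    return {tree: w / total for tree, w in weights.items()}


# Hypothetical log Pr[Data | Tree] for the three rooted trees of species A, B, C.
log_likelihoods = {"((A,B),C)": -1000.0, "((A,C),B)": -1002.0, "((B,C),A)": -1005.0}
priors = {tree: 1.0 / 3.0 for tree in log_likelihoods}  # assumed uniform prior

posterior = posterior_over_trees(log_likelihoods, priors)
best_tree = max(posterior, key=posterior.get)

for tree, prob in posterior.items():
    print(f"{tree}: {prob:.4f}")
print("highest posterior probability:", best_tree)
```

With only three species every possible rooted tree can be listed, so Pr[Data] is an exact sum. For larger groups the number of trees makes that sum impractical to compute directly.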
Phylogenetic analysis can be difficult because a large number of trees poten-
tially could describe the relationships of a group of species. Evaluating which of
 