analytical techniques result in an unrooted tree or unrooted phylogeny, one in
which the earliest point in time is unidentified (Figure 12.3). In molecular
phylogenies, branch length is the average number of nucleotide substitutions per
site. If a branch length is 0.2, then each site has undergone, on average, 0.2
changes along that branch. Because any given nucleotide either changes or does
not, this average is taken over sites that each contribute 0 or 1 change.
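The averaging described above can be illustrated with a minimal sketch. The
function name branch_length and the list of per-site 0/1 scores are
hypothetical, chosen only for illustration; they are not taken from any
particular phylogenetics package.

```python
def branch_length(site_changes):
    """Average number of changes per site along one branch.

    site_changes holds one entry per site: 1 if the site changed along
    the branch, 0 if it did not (an assumed, simplified scoring).
    """
    if not site_changes:
        raise ValueError("need at least one site")
    return sum(site_changes) / len(site_changes)


# Hypothetical branch on which 2 of 10 sites changed.
changes = [0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
print(branch_length(changes))  # 0.2
```

Each individual site contributes 0 or 1, but the average over all ten sites is
0.2, which is how a fractional branch length arises.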
Molecular data used to construct trees are either discrete characters or
similarities (distances). Examples of discrete molecular characters include DNA
sequences, allozyme frequencies, and restriction-map data. Most methods assume
independence and homology among discrete characters. Distance data specify a
relationship between pairs of taxa or molecules. Sequence, restriction-map, and
allozyme data must be transformed to produce distance data. Once data have
been gathered and transformed into appropriate values, there are four broad
categories of methods to estimate phylogeny. Three of these, distance-matrix
methods, maximum-parsimony methods, and maximum-likelihood methods, are
discussed in detail by Swofford and Olson (1990), Weir (1990), Hillis et al.
(1996), Nei (1996), Huelsenbeck and Rannala (1997), Steel and Penny (2000),
Whelan et al. (2001), and Hall (2011). The fourth, a more recent addition to
phylogenetic analysis, is Bayesian inference (Shoemaker et al. 1999,
Huelsenbeck et al. 2001).
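As one illustration of transforming sequence data into distance data, the
sketch below computes the proportion of differing sites (often called the
p-distance) for each pair of aligned sequences. The choice of p-distance is an
assumption made here for simplicity; the text does not name a specific
transformation, and corrected distances are often preferred in practice. The
taxon names and sequences are invented.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def distance_matrix(aligned):
    """Map each unordered pair of taxon names to its p-distance."""
    return {
        (x, y): p_distance(aligned[x], aligned[y])
        for x, y in combinations(sorted(aligned), 2)
    }


# Invented, pre-aligned sequences for three hypothetical taxa.
aligned = {
    "taxonA": "ACGTACGTAC",
    "taxonB": "ACGTACGTTC",
    "taxonC": "ACGAACGTTG",
}
for pair, d in distance_matrix(aligned).items():
    print(pair, d)
```

Here taxonA and taxonB differ at 1 of 10 sites, taxonA and taxonC at 3, and
taxonB and taxonC at 2, so the pairwise distances are 0.1, 0.3, and 0.2.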
Distance-matrix methods are based on the set of distances calculated
between each pair of species and are the oldest family of phylogenetic
reconstruction methods. The computations are relatively simple, and the quality
of the resulting tree depends on the quality of the distance measure. Distances
are usually used to group taxonomic units into phenetic groups by clustering.
Several clustering methods can be used, but the most widely used is the
Unweighted Pair-Group Method using an Arithmetic Average (UPGMA). It
defines the intercluster distance as the average of all the pairwise distances
between members of two clusters. The results of the clustering can be presented
in a dendrogram, in which the branch points are placed midway between two
sequences or clusters. The distance between a pair of sequences is the sum of
the branch lengths connecting them. The UPGMA method is often used for distance
matrices, and it generally performs well when mutation rates are the same along
all branches of the tree. This assumption of nearly equal mutation rates (that
is, that a molecular clock is operating) is crucial for the UPGMA method.
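The UPGMA procedure described above can be sketched in a few lines. This is a
minimal illustration under stated assumptions, not a reference implementation:
the function name upgma, the frozenset-keyed distance dictionary, and the
nested-tuple tree are all hypothetical choices made here, and ties are broken
arbitrarily by iteration order.

```python
from itertools import combinations


def upgma(dist):
    """Cluster taxa by UPGMA.

    dist maps frozenset({a, b}) to the distance between taxa a and b,
    for every pair of taxa. Returns (tree, heights): tree is a nested
    tuple of taxon names, and heights maps each internal node to the
    height of its branch point (half the intercluster distance).
    """
    taxa = sorted({t for pair in dist for t in pair})
    members = {t: [t] for t in taxa}  # cluster -> taxa it contains
    heights = {}

    def average(c1, c2):
        # Intercluster distance: mean of all pairwise member distances.
        pairs = [(a, b) for a in members[c1] for b in members[c2]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(members) > 1:
        # Merge the two clusters with the smallest average distance.
        c1, c2 = min(combinations(list(members), 2),
                     key=lambda p: average(*p))
        d = average(c1, c2)
        merged = (c1, c2)
        members[merged] = members.pop(c1) + members.pop(c2)
        heights[merged] = d / 2  # branch point placed midway
    (tree,) = members
    return tree, heights


# Invented distances that are consistent with a molecular clock.
dist = {
    frozenset({"A", "B"}): 0.2,
    frozenset({"A", "C"}): 0.6,
    frozenset({"B", "C"}): 0.6,
}
tree, heights = upgma(dist)
print(tree)
print(heights)
```

In this example A and B are joined first, with their branch point at height
0.1, and C then joins that cluster at height 0.3. Summing the branch lengths
between A and B gives 0.1 + 0.1 = 0.2, which recovers the input distance.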
For situations in which the assumptions of the molecular clock are
inappropriate, the Fitch-Margoliash algorithm can be used (Weir 1990). If
information for an out-group is available, the resultant tree can be rooted. The
Fitch-Margoliash method allows for the possibility that the tree found is
incorrect and recommends that other trees be compared based on a measure of
goodness