Exploring the solution space. One of the appealing aspects of SATé is
that it provides opportunities to explore the set of alignments and
trees returned during a SATé run, which lets you examine, among other
things, how alignments affect tree estimation. This is particularly
useful on small datasets, because each iteration completes quickly and
so many iterations can be run. To enable this exploration, we recommend
setting the stopping rule to an iteration limit, and setting that
limit to a large number (how large, of course, depends upon how
much time you wish to devote). There are many methods for
exploring sets of trees [60-64], each aimed at extracting different
types of information. Similar analyses for exploring sets of alignments
are not yet in standard use, but pairs of alignments are often
compared to determine common homologies [65].
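As an illustration of comparing a pair of alignments for common homologies, the following is a minimal sketch and not part of SATé. It assumes each alignment is already loaded as a mapping from sequence name to gapped row, with "-" as the gap character. The function names homologous_pairs and shared_homology_fraction are hypothetical, and the shared fraction (intersection over union of residue pairs) is just one possible summary.

```python
from itertools import combinations


def homologous_pairs(alignment):
    """Return the set of residue pairs that share a column.

    `alignment` maps a sequence name to its gapped row. A residue is
    identified by (name, index in the ungapped sequence), so the result
    does not depend on how the columns are numbered.
    """
    names = sorted(alignment)
    lengths = {len(alignment[name]) for name in names}
    if len(lengths) > 1:
        raise ValueError("alignment rows differ in length")
    n_cols = lengths.pop() if lengths else 0
    next_index = {name: 0 for name in names}  # next ungapped position per row
    pairs = set()
    for col in range(n_cols):
        present = []
        for name in names:
            if alignment[name][col] != "-":
                present.append((name, next_index[name]))
                next_index[name] += 1
        # Every two residues in the same column are asserted homologous.
        pairs.update(combinations(present, 2))
    return pairs


def shared_homology_fraction(aln_a, aln_b):
    """Fraction of homologous pairs common to both alignments (Jaccard)."""
    pairs_a = homologous_pairs(aln_a)
    pairs_b = homologous_pairs(aln_b)
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 1.0
```

Applied to the alignments saved from two different iterations of a run, a value near 1 would suggest that the two alignments assert largely the same homologies, while a low value would flag iterations worth inspecting more closely.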
Multi-locus datasets. Often the objective is to estimate a species
tree from a set of different genes, each of which requires its own
alignment and tree estimation. You have several options for a
multi-locus analysis, depending on whether you are concerned
about the potential for gene trees to differ from the species
tree. True gene trees can differ from the true species tree due
to biological processes such as incomplete lineage sorting, gene
duplication and loss, and horizontal gene transfer [66]. Therefore,
the choice of how to estimate the species tree from a set of estimated
gene trees requires some care. If you have concerns about potential
conflict between gene trees, you can run SATé on each marker
separately, producing an independently estimated alignment and gene
tree for each gene. These estimated trees and alignments
can then be used to estimate a species tree using techniques that are
specifically designed to combine estimated gene trees into a species
tree. See [67-74] and references therein for an introduction to
methods that can estimate phylogenetic trees and networks in the
presence of the processes that cause gene tree incongruence. If
you are not concerned about potential gene tree conflict, we recommend
using SATé in its default setting for multi-locus datasets. This
analysis concatenates the datasets and then
uses the standard iterative divide-and-conquer strategy to produce
an alignment of each locus and a tree on the entire dataset.
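To make the idea of concatenation concrete, the following is a minimal sketch of joining per-locus alignments into a single matrix. It is not SATé's internal implementation. The function name concatenate_loci is hypothetical, each locus is assumed to be a mapping from taxon name to aligned row, and padding taxa that are missing from a locus with gaps is an assumption made here for illustration.

```python
def concatenate_loci(loci, gap="-"):
    """Concatenate per-locus alignments into one matrix.

    `loci` is a list of dicts, each mapping taxon name to an aligned row.
    Taxa absent from a locus are padded with gaps (an assumption of this
    sketch). Returns (matrix, partitions), where partitions holds the
    1-based inclusive (start, end) column range of each locus.
    """
    taxa = sorted({taxon for locus in loci for taxon in locus})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for locus in loci:
        widths = {len(row) for row in locus.values()}
        if len(widths) != 1:
            raise ValueError("each locus needs rows of one common length")
        width = widths.pop()
        for taxon in taxa:
            # Use the aligned row if present, otherwise an all-gap row.
            pieces[taxon].append(locus.get(taxon, gap * width))
        partitions.append((start, start + width - 1))
        start += width
    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, partitions
```

Keeping the column range of each locus alongside the concatenated matrix makes it possible to recover the per-locus alignments later, or to pass the boundaries to a tree estimation tool that supports partitioned analyses.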
General advice. We recommend that you back up your files
(see Note 8) for all SATé analyses. This is generally good practice,
but it matters especially for large dataset analyses or when you wish to explore
the solution space, which can take a substantial amount of time to
run. Some analyses may benefit from the use of archival systems (see
Note 9), especially if your analyses involve very large datasets that
you plan to explore in multiple ways.