“sequence.fasta”, the PartTree algorithm can be invoked using
the following command: mafft --parttree --retree 2 --partsize 1000
sequence.fasta > startingAlignment.fasta. The command to run
Clustal Omega is: clustalo --auto --dealign -i sequence.fasta >
startingAlignment.fasta. Once you have the alignment, you can
provide this to SATé as the initial alignment (see above).
4. In the “External Tools” window, choose the following software
settings: “MAFFT” for the “Aligner” dropbox, “Muscle” for
the “Merger” dropbox, and “FastTree” for the “Tree Estimator”
dropbox. For nucleotide analyses, select “GTR + CAT”
for the “Model” dropbox, and for protein analyses, select
“JTT + CAT”.
5. In the “Sequences and Tree” window, provide your initial
alignment (if available), and click on “initial alignment
(use for initial tree)”. Follow from step 3 in Subheading 8.6.
6. In “Workflow Settings”, do not select “Extra RAxML Search”
unless your dataset is not particularly big; the final RAxML
search could be the most computationally intensive part of
your analysis, and may not provide substantial benefits.
7. In the “Job Settings” window, make sure you provide the
number of CPU(s) available (this will have a large impact on
the running time if more than 1 CPU can be used in the
analysis). Also make sure that the “Max. Memory (MB)” dialog
specifies the correct amount of available memory, since memory
limitations are often a problem that causes running times to
increase. See Note 7.
8. In the “SATé settings” window, you can use Quick Set to select
“SATé-II-fast”; this will set all the settings appropriately.
Alternatively, you can modify the settings as follows. Select the
“Size” radio button in the “Max. Subproblem” field and a
size of 200 in the dropdown menu. Set the decomposition to
“centroid” (using “Longest” not only slows down the analysis,
but should also only be run with Opal, and Opal should not be
run with large datasets). Set the “Apply Stop Rule” to either
“After Launch” (for very large datasets) or to “After Last
Improvement”. Do not select “Blind Mode Enabled” if your
dataset is very large. It is also probably not a good idea to use
a time limit for the stopping rule if your dataset is very large,
since a single iteration may not complete in the time you pick.
We therefore recommend picking an iteration limit instead. The
number of iterations should depend on your dataset, but for very
large datasets it may be best to use a small number (say, 2).
If these complete quickly, you can always use the output
alignment and tree to initialize another SATé run. We recommend
setting “Return” to “Best”.
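
The two starting-alignment commands mentioned earlier (MAFFT with PartTree, and Clustal Omega) can be wrapped in a small shell helper so that the chosen command is printed and checked before it is run on a large dataset. This is only a minimal sketch: the function name starting_alignment_cmd is a hypothetical helper, not part of SATé, MAFFT, or Clustal Omega, and the flags are simply the ones quoted in the text.

```shell
#!/bin/sh
# Sketch: print the starting-alignment command for a given aligner.
# starting_alignment_cmd is a hypothetical helper name (an assumption),
# not something provided by SATé, MAFFT, or Clustal Omega.
starting_alignment_cmd() {
    aligner=$1   # "mafft" or "clustalo"
    input=$2     # unaligned FASTA file, e.g. sequence.fasta
    case "$aligner" in
        mafft)
            # MAFFT in PartTree mode, with the options quoted in the text
            printf 'mafft --parttree --retree 2 --partsize 1000 %s\n' "$input"
            ;;
        clustalo)
            # Clustal Omega, with the options quoted in the text
            printf 'clustalo --auto --dealign -i %s\n' "$input"
            ;;
        *)
            printf 'unknown aligner: %s\n' "$aligner" >&2
            return 1
            ;;
    esac
}

# Dry run: show both commands without executing them.
starting_alignment_cmd mafft sequence.fasta
starting_alignment_cmd clustalo sequence.fasta
```

To produce the alignment, run the printed line with its output redirected, for example by appending "> startingAlignment.fasta". The helper only prints text, so it can be tried on a machine where neither aligner is installed.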