Table 6. FASTA command-line options

FASTA option  Description                                               BLAST option
-b:           high scores reported (limited by -E by default)           -num_descriptions
-d:           number of alignments shown (limited by -E by default)     -num_alignments
-e:           expand_script to extend hits
-E:           [10,1] E()-value, E()-repeat threshold                    -evalue
-f:           [-10] gap-open penalty                                    -gapopen
-F:           [0] min E()-value displayed
-g:           [-2] gap-extension penalty                                -gapextend
-h            help: show options, arguments                             -h
-m:           [0] output/alignment format                               -outfmt
-M:           filter on library sequence length
-O:           write results to file                                     -out
-p/-n         protein/nucleotide query                                  blastp/blastn
-r:           [+5/-4] +match/-mismatch for DNA/RNA                      -reward / -penalty
-s:           [BL50] scoring matrix: (protein) BL50, BL62, PAM250,      -matrix
              OPT5, VT160, VT120, BL80; VT80, VT40, VT20, VT10;
              scoring matrix file name; -s ?BL50 adjusts matrix for
              short queries
-S            filter lowercase (seg) residues                           -lcase_masking
-T:           max threads/workers                                       -num_threads
-V:           annotation characters (phospho-sites, variation) in
              query/library for alignments
-z:           [1] statistics estimation method: 1-6: regression, MLE,   -comp_based_stats
              etc.; 11-16: estimates from shuffled library sequences;
              21-26: E2()-stats from shuffled high-scoring sequences
-Z:           [library entries] database size for E()-value             -dbsize
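
As a rough illustration of how the option correspondences in Table 6 might be used in a script, the Python sketch below stores the FASTA-to-BLAST flag pairs in a dictionary and looks them up. The names FASTA_TO_BLAST, FASTA_TO_BLAST_PROGRAM, blast_equivalent, and describe are hypothetical helpers introduced here; they are not part of either package. The sketch maps option names only. Argument values should not be assumed to carry over unchanged, since defaults and conventions may differ between the two packages, so each value would need to be checked against the respective documentation.

```python
# Option-name correspondences transcribed from Table 6.
# None marks FASTA options for which the table lists no BLAST counterpart.
FASTA_TO_BLAST = {
    "-b": "-num_descriptions",
    "-d": "-num_alignments",
    "-e": None,
    "-E": "-evalue",
    "-f": "-gapopen",
    "-F": None,
    "-g": "-gapextend",
    "-h": "-h",
    "-m": "-outfmt",
    "-M": None,
    "-O": "-out",
    "-r": "-reward / -penalty",
    "-s": "-matrix",
    "-S": "-lcase_masking",
    "-T": "-num_threads",
    "-V": None,
    "-z": "-comp_based_stats",
    "-Z": "-dbsize",
}

# -p/-n select the query type in FASTA; Table 6 pairs them with
# BLAST program names rather than with options.
FASTA_TO_BLAST_PROGRAM = {"-p": "blastp", "-n": "blastn"}


def blast_equivalent(fasta_flag):
    """Return the BLAST option that Table 6 pairs with a FASTA flag.

    Returns None when the table lists no counterpart and raises
    KeyError for flags that do not appear in the table at all.
    """
    return FASTA_TO_BLAST[fasta_flag]


def describe(fasta_flags):
    """Build one human-readable line per FASTA flag."""
    lines = []
    for flag in fasta_flags:
        blast = blast_equivalent(flag)
        target = blast if blast is not None else "(no BLAST counterpart listed)"
        lines.append(f"{flag} -> {target}")
    return lines


if __name__ == "__main__":
    for line in describe(["-E", "-T", "-M"]):
        print(line)
```

Keeping the unmatched options in the dictionary with a value of None, rather than omitting them, lets a caller distinguish "listed in Table 6 without a BLAST counterpart" from "not a FASTA option covered by the table".
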
Summary: Both BLAST and FASTA provide a comprehensive set of
protein:protein, translated-DNA:protein, and DNA:DNA sequence
similarity searching programs. The BLAST package extends the
heuristic BLAST approach [6, 8] in two directions: psiblast, for
more sensitive iterated protein sequence comparisons, and
blastn -megablast, for rapid mapping of DNA sequences against
genomes. Recent improvements in the BLAST programs have focused
on improved statistical estimates,