Table 3. FASTA-specific programs

FASTA program (a)     Description
ssearch               Compare a protein sequence to a protein sequence database
                      or a DNA sequence to a DNA sequence database using
                      the Smith-Waterman [2] algorithm
ggsearch / glsearch   Compare a protein sequence to a protein sequence database or
                      a DNA sequence to a DNA sequence database using global:global
                      or global:local alignment
lalign                Compare two protein sequences or DNA sequences reporting
                      all significant non-overlapping local alignments using the
                      Waterman-Eggert algorithm [5] as implemented by Huang [26]
                      (sim4)
fasts/m / tfasts/m    Compare a set of short un-ordered [S] or ordered [M] peptides or
                      oligonucleotides to a protein/translated DNA or DNA database [27]
fastf / tfastf        Compare a set of short “mixed peptide” sequences to a protein
                      or translated DNA [28]

(a) The names of the FASTA programs are typically followed by a major version number, e.g., fasta36 or
ssearch36. These numbers are not shown.
of the Waterman-Eggert algorithm for non-overlapping local
alignments [ 5 ] to identify internally repeated domains.
The local alignment strategies used by BLAST and FASTA are
ideal for searches for shared homologous domains, or with partial
query sequences, because they identify the best local alignment
between two sequences, ignoring the unrelated sequence context.
Global sequence alignment programs require alignments to extend
from the beginning to the end of the sequences and can be more
effective in capturing conserved domain morphology over an entire
protein. The FASTA package also offers two optimal global alignment
programs. ggsearch computes an alignment score that is
global for both the query sequence and library sequence; it is
particularly useful for functional inference, since it requires all the
domains in the homologous protein to be present. glsearch
calculates an alignment that is global in the query sequence (e.g.,
a full-length domain) but can be local in the library sequence.
Because of its requirement for global similarity, ggsearch only
aligns library sequences that are between 75% and 133% of the length
of the query; likewise, glsearch only aligns library sequences that are
more than 75% of the length of the query.
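The difference between local, global:global, and global:local scoring, and the length windows just described, can be made concrete with a small sketch. The Python below is an illustration only, not the FASTA package's implementation: the function names, the match/mismatch scores, and the linear gap penalty are assumptions chosen for readability (the real programs use scoring matrices and affine gap penalties), and treating the 75% and 133% bounds as inclusive for ggsearch is likewise an assumption.

```python
def alignment_score(query, library, mode="local", match=2, mismatch=-1, gap=-2):
    """Best alignment score under a simple linear gap penalty.

    mode="local":  both ends free in both sequences (Smith-Waterman style)
    mode="global": end to end in both sequences (global:global)
    mode="glocal": end to end in the query, free ends in the library
                   (global:local)
    """
    n, m = len(query), len(library)
    # prev[j] holds the best score for query[:i] against library[:j].
    if mode == "global":
        prev = [j * gap for j in range(m + 1)]  # leading library residues cost gaps
    else:
        prev = [0] * (m + 1)                    # leading library residues are free
    best_local = 0
    for i in range(1, n + 1):
        # In local mode unaligned query residues are free; otherwise they cost gaps.
        cur = [0 if mode == "local" else i * gap] + [0] * m
        for j in range(1, m + 1):
            pair = match if query[i - 1] == library[j - 1] else mismatch
            score = max(prev[j - 1] + pair, prev[j] + gap, cur[j - 1] + gap)
            if mode == "local":
                score = max(score, 0)           # a local alignment may restart anywhere
                best_local = max(best_local, score)
            cur[j] = score
        prev = cur
    if mode == "local":
        return best_local
    if mode == "global":
        return prev[m]                          # both sequences fully consumed
    return max(prev)                            # query consumed, trailing library free


def within_length_window(query_len, library_len, program):
    """Length filter as described in the text (bounds assumed inclusive for ggsearch)."""
    if program == "ggsearch":
        return 75 * query_len <= 100 * library_len <= 133 * query_len
    if program == "glsearch":
        return 100 * library_len > 75 * query_len
    raise ValueError("unknown program: " + program)


if __name__ == "__main__":
    query, library = "ACGT", "TTACGTTT"
    for mode in ("local", "glocal", "global"):
        print(mode, alignment_score(query, library, mode))
```

With the short query embedded in a longer library sequence, the local and global:local scores reward the shared region alone, while the global:global score is pulled down by the gaps needed to span the unmatched library residues. That penalty is why a length window is a sensible filter for ggsearch.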
The FASTA package also includes several programs designed to
align unordered short peptides (fasts) or ordered sets of non-contiguous
oligonucleotides (fastm). fasts is particularly useful
for aligning the peptides produced by mass spectrometry proteomic
sequencing.
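The distinction between unordered and ordered fragment sets can be shown with a toy example. The sketch below is not how fasts or fastm work internally: it uses exact substring matching as a stand-in for scored alignment, and the function names and the example sequence are hypothetical. It only shows what the ordering constraint adds.

```python
def find_unordered(peptides, protein):
    """Return {peptide: start} if every peptide occurs somewhere in protein, else None."""
    hits = {}
    for pep in peptides:
        pos = protein.find(pep)
        if pos < 0:
            return None          # one missing peptide fails the whole set
        hits[pep] = pos
    return hits


def find_ordered(fragments, protein):
    """Return start positions if fragments occur left to right without overlap, else None."""
    starts = []
    cursor = 0                   # next fragment must begin at or after this index
    for frag in fragments:
        pos = protein.find(frag, cursor)
        if pos < 0:
            return None
        starts.append(pos)
        cursor = pos + len(frag)
    return starts


if __name__ == "__main__":
    protein = "MKTAYIAKQRQISFVKSHFSRQ"   # made-up sequence for illustration
    print(find_unordered(["SHFSR", "KTAYI"], protein))
    print(find_ordered(["SHFSR", "KTAYI"], protein))
    print(find_ordered(["KTAYI", "SHFSR"], protein))
```

The same two fragments match as an unordered set regardless of how they are listed, but match as an ordered set only when listed in the order in which they occur in the sequence.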