BLAST and FASTA Similarity Searching for Multiple Sequence Alignment - Multiple Sequence Alignment Methods

Table 3
FASTA-specific programs

FASTA program (a)    Description
ssearch              Compare a protein sequence to a protein sequence database
                     or a DNA sequence to a DNA sequence database using the
                     Smith-Waterman [2] algorithm
ggsearch / glsearch  Compare a protein sequence to a protein sequence database
                     or a DNA sequence to a DNA sequence database using
                     global:global or global:local alignment
lalign               Compare two protein sequences or DNA sequences, reporting
                     all significant non-overlapping local alignments using the
                     Waterman-Eggert algorithm [5] as implemented by Huang [26]
                     (sim4)
fasts/m / tfasts/m   Compare a set of short unordered [S] or ordered [M]
                     peptides or oligonucleotides to a protein/translated DNA
                     or DNA database [27]
fastf / tfastf       Compare a set of short “mixed peptide” sequences to a
                     protein or translated DNA [28]

(a) The names of the FASTA programs are typically followed by a major version number, e.g., fasta36 or ssearch36. These numbers are not shown.
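The naming convention in the table footnote can be illustrated with a short Python sketch that assembles a command line from a base program name. The helper name fasta_command and the file names are hypothetical. The sketch assumes the usual invocation order of program, query file, and library file, and takes its default major version of 36 from the fasta36 and ssearch36 examples in the footnote. It only builds the argument list and does not run anything.

```python
def fasta_command(program, query_file, library_file, major_version=36):
    """Return an argument list such as ['ssearch36', 'query.fa', 'library.fa'].

    Hypothetical helper: it appends the major version number to the base
    program name and places the query file before the library file.
    """
    if not program.isalpha():
        raise ValueError("program should be a base name such as 'ssearch'")
    return [f"{program}{major_version}", query_file, library_file]


# Illustrative file names only.
print(fasta_command("ssearch", "query.fa", "library.fa"))
```

The resulting list could be handed to a process launcher such as subprocess.run on a system where the FASTA package is installed.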

of the Waterman-Eggert algorithm for non-overlapping local alignments [5] to identify internally repeated domains.

The local alignment strategies used by BLAST and FASTA are well suited to searching for shared homologous domains, or to searching with partial query sequences, because they identify the best local alignment between two sequences and ignore the unrelated sequence context.
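The difference between local and end-to-end scoring can be seen in a minimal Python sketch. It is not the code used by BLAST or FASTA, which rely on substitution matrices and affine gap penalties. The function name alignment_score, the match, mismatch, and gap values, and the example sequences are all illustrative assumptions. A partial query that matches a domain embedded in a longer sequence scores well locally, while an end-to-end alignment is penalized for the unrelated flanking residues.

```python
def alignment_score(a, b, mode="local", match=2, mismatch=-1, gap=-2):
    """Score-only dynamic programming with a linear gap penalty.

    mode="local"  : Smith-Waterman style, best-scoring sub-alignment
    mode="global" : Needleman-Wunsch style, end-to-end alignment
    The scoring values are illustrative, not those of any real program.
    """
    if mode not in ("local", "global"):
        raise ValueError("mode must be 'local' or 'global'")
    local = mode == "local"
    # First row: zero for local, accumulated gap cost for global.
    prev = [0 if local else j * gap for j in range(len(b) + 1)]
    best = 0
    for i, x in enumerate(a, start=1):
        curr = [0 if local else i * gap]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            score = max(diag, prev[j] + gap, curr[j - 1] + gap)
            if local:
                # A local alignment may restart anywhere, so never go below 0.
                score = max(score, 0)
                best = max(best, score)
            curr.append(score)
        prev = curr
    return best if local else prev[-1]


domain = "HEAGAWGHEE"                      # made-up partial query
protein = "PPPPPPPP" + domain + "KKKKKKKK"  # same domain with unrelated flanks

print(alignment_score(domain, protein, mode="local"))
print(alignment_score(domain, protein, mode="global"))
```

With these assumed scores the local score reflects only the ten matching residues, whereas the global score also pays for the sixteen flanking residues that must be aligned against gaps.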

Global sequence alignment programs require alignments to extend from the beginning to the end of the sequences and can be more effective in capturing conserved domain architecture over an entire protein. The FASTA package also offers two optimal global alignment programs. ggsearch computes an alignment score that is global for both the query sequence and the library sequence; it is particularly useful for functional inference, since it requires all the domains in the homologous protein to be present. glsearch calculates an alignment that is global in the query sequence (e.g., a full-length domain) but can be local in the library sequence. Because of its requirement for global similarity, ggsearch only aligns library sequences that are between 75% and 133% of the length of the query; likewise, glsearch only aligns library sequences that are more than 75% of the length of the query.
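The length limits described above can be expressed as a simple pre-filter. The following Python sketch is an illustration only, not the FASTA package's own code. The function name passes_length_filter is hypothetical, and treating the 75% and 133% bounds for ggsearch as inclusive is an assumption. Integer arithmetic is used so that the boundary cases do not depend on floating-point rounding.

```python
def passes_length_filter(query_len, library_len, program):
    """Return True if a library sequence would be aligned under the
    length limits quoted in the text (hypothetical helper).

    ggsearch: library length between 75% and 133% of the query length
              (bounds assumed inclusive)
    glsearch: library length more than 75% of the query length
    """
    if query_len <= 0 or library_len < 0:
        raise ValueError("lengths must be positive")
    if program == "ggsearch":
        return 75 * query_len <= 100 * library_len <= 133 * query_len
    if program == "glsearch":
        return 100 * library_len > 75 * query_len
    raise ValueError("program must be 'ggsearch' or 'glsearch'")


# For a 200-residue query, ggsearch would consider lengths 150 to 266.
for length in (149, 150, 266, 267, 5000):
    print(length,
          passes_length_filter(200, length, "ggsearch"),
          passes_length_filter(200, length, "glsearch"))
```

A much longer library sequence passes the glsearch check but not the ggsearch check, which matches the idea that glsearch may be local in the library sequence.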

The FASTA package also includes several programs designed to align unordered short peptides (fasts) or ordered sets of non-contiguous oligonucleotides (fastm). fasts is particularly useful for aligning the peptides produced by mass spectrometry proteomic sequencing.
