Biomedical Engineering Reference
The low complexity filter masks off the regions of the query sequence (the sequence entered in the
"Search" field) that have low compositional complexity. Areas of low complexity, such as those
composed of only a few characters repeated, are not likely to be biologically interesting. The "Human
repeats" option masks repeating sequences, speeding the search, especially against databases
containing sequences with large numbers of repeats. The "Mask for lookup table only" option is an
experimental mask that is applied only while the word lookup table is built, so that no hits are
seeded from low-complexity sequences. The "Mask lower case" option treats lowercase letters in the
"Search" field as masked, so that only the uppercase portions of the query are used in the search.
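The masking ideas above can be illustrated with a minimal sketch. It is not the filter that BLAST itself applies: the windowed Shannon entropy measure, the window length, the threshold, and the function names are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy, in bits per character, of the letters in `window`."""
    counts = Counter(window)
    total = len(window)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mask_low_complexity(seq: str, window: int = 12, threshold: float = 1.0,
                        mask_char: str = "N") -> str:
    """Mask every position covered by a window whose entropy is below `threshold`.

    `window` and `threshold` are illustrative values, not BLAST defaults.
    """
    seq = seq.upper()
    masked = list(seq)
    for start in range(len(seq) - window + 1):
        if shannon_entropy(seq[start:start + window]) < threshold:
            for i in range(start, start + window):
                masked[i] = mask_char
    return "".join(masked)


def mask_lower_case(seq: str, mask_char: str = "N") -> str:
    """Treat lowercase letters as user-masked and keep uppercase letters."""
    return "".join(mask_char if ch.islower() else ch for ch in seq)
```

For example, `mask_lower_case("ACGTacgtACGT")` leaves the two uppercase blocks intact and replaces the lowercase block with the mask character, which mirrors the behavior described for the "Mask lower case" option.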
The "Expect" field sets the statistical significance threshold for reporting matches against
database sequences. The lower the threshold, the more stringent the reporting criterion, resulting in
fewer chance matches being reported. The default value is "10," meaning that up to 10 of the reported
matches are expected to occur by chance alone. In comparison, a search with an "Expect" value of "1"
would be expected to return only about 1 match by chance alone. Too small a value in the "Expect"
field can exclude genuine but weak matches, leaving too few search results. "Word Size" can be set to
7, 11, or 15 nucleotides through a pull-down menu.
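To make the role of the threshold concrete, the sketch below computes an expect value with the Karlin-Altschul relation E = K * m * n * exp(-lambda * S) and then keeps only the hits at or below a chosen threshold. The function names and the dictionary layout of a hit are hypothetical, and the values of K and lambda depend on the scoring system, so they must be supplied by the caller.

```python
import math


def expect_value(raw_score: float, query_len: int, db_len: int,
                 lam: float, k: float) -> float:
    """Expected number of chance matches scoring at least `raw_score`.

    Implements E = K * m * n * exp(-lambda * S), where m is the query
    length and n is the total database length. `lam` and `k` are
    scoring-system parameters that the caller must provide.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


def filter_hits(hits, expect_threshold: float = 10.0):
    """Keep hits whose 'evalue' entry does not exceed the threshold.

    Each hit is assumed to be a dict with an 'evalue' key (hypothetical layout).
    """
    return [hit for hit in hits if hit["evalue"] <= expect_threshold]
```

With three hypothetical hits whose expect values are 0.001, 0.5, and 8.0, a threshold of 10 keeps all three, a threshold of 1 keeps two, and a threshold of 0.01 keeps only the strongest, which is the sense in which a lower threshold is more stringent.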
In addition to the pull-down menu and checkbox options, the "Other advanced" field accepts
command-line entry of advanced options, including the cost to open and extend gaps, the
specification of penalties for nucleotide mismatch, the reward for a match, and the ability to adjust
output formatting. The "Other advanced" field can also be used to override many of the program
default settings. For example, the command "-W12" sets the word size to 12, an option not available
through the pull-down menus.
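The effect of the word size can be sketched with a simplified model of seeding, in which a hit can begin only where the query and a database sequence share an exact word of the chosen length. This is an illustration under that assumption, not BLAST's actual lookup-table code, and the function names are hypothetical.

```python
from typing import List


def query_words(query: str, word_size: int = 11) -> List[str]:
    """Return every overlapping word of length `word_size` in `query`."""
    if word_size < 1:
        raise ValueError("word_size must be at least 1")
    return [query[i:i + word_size] for i in range(len(query) - word_size + 1)]


def shared_words(query: str, subject: str, word_size: int) -> List[str]:
    """Return the query words that also occur exactly in `subject`."""
    subject_words = set(query_words(subject, word_size))
    return [word for word in query_words(query, word_size) if word in subject_words]
```

In this model a smaller word size yields more shared words, and therefore more candidate matches to examine, while a larger word size yields fewer candidates and a faster but less sensitive search.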
The other major options of BLAST deal with formatting the output. Formatting options range from
color graphics in which the colors represent alignment scores to page formatting. Perhaps the most
useful output utility is a Database Linkout feature, which provides reference links from the BLAST
Results to various NCBI databases and other resources.
BALSA
The BALSA tool, from the Center for Bioinformatics at Rensselaer and Wadsworth Center of the New
York State Department of Health, provides Web-based access to Bayesian-based sequence alignment (see
Figure 8-15). A virtually identical tool, BALSA Database Query, is available for database queries using
either the PDB or the Structural Classification of Proteins (SCOP) databases.
Figure 8-15. BALSA Pairwise Sequence Alignment Tool. This Web-based tool
is provided by the Center for Bioinformatics at Rensselaer and Wadsworth
Center of the New York State Department of Health.