The e-value is the number of hits with the same level of similarity that would be
expected by chance if there were no true matches in the database; thus, a hit with
an e-value of 0.01 would be expected about once in every 100 searches even when
there is no true match in the database. BLAST searches can be run over the web
through the National Center for Biotechnology Information (NCBI), the European
Bioinformatics Institute (EBI), or the DNA Data Bank of Japan (DDBJ). Once the
sequences with which the data are to be compared have been obtained, they need
to be aligned.
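To make the e-value interpretation above concrete, the short Python sketch below converts an e-value into the probability of seeing at least one chance hit of that quality. It assumes that chance hits follow a Poisson distribution whose mean is the e-value, which is the usual statistical model behind BLAST; the function name is hypothetical and chosen only for illustration.

```python
import math


def prob_at_least_one_chance_hit(e_value: float) -> float:
    """Probability of one or more chance hits at least this good.

    Assumption: the number of chance hits is Poisson-distributed with
    mean equal to the e-value, so P = 1 - exp(-E).
    """
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)


if __name__ == "__main__":
    for e in (10.0, 1.0, 0.01, 1e-5):
        p = prob_at_least_one_chance_hit(e)
        print(f"E = {e:g}: P(at least one chance hit) = {p:.6f}")
```

For small e-values the probability is almost equal to the e-value itself (an e-value of 0.01 gives a probability of about 0.00995), which is why the two are often read interchangeably for strong hits.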
12.6.6 Aligning Sequences
Sequences can be aligned either with other sequences obtained in the project
or with sequences obtained from databases such as GenBank (Figure 12.4).
Aligning the sequences usually involves computer analyses of the sequences
using one of three major methods for comparing sequence similarity: matrix
plots, global alignments, and local alignments (Hillis et al. 1990, 1996). Both
alignment and phylogenetic inferences involve assumptions and subjective
decisions (Hillis et al. 1990, 1996; Howe and Ward 1989; Gribskov and Devereux
1991; Hall 2011). The alignments usually are made based on the assumption of
parsimony. Parsimony dictates that an alignment of sequences is based on the
minimal number of changes needed to transform one sequence into the other.
ClustalW is a commonly used program that aligns DNA (or amino-acid) sequences
in such a way as to maximize the number of residues that match by introducing
gaps or spaces into one or the other sequence. These gaps are assumed to be due
to insertions or deletions that occurred as the sequences diverged from a common
ancestor over evolutionary time (Thompson et al. 1994; Larkin et al. 2007; Hall 2011).
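The idea of introducing gaps to maximize matching residues can be illustrated with a minimal pairwise global alignment. The sketch below uses the Needleman-Wunsch dynamic-programming method, which is not ClustalW's own procedure (ClustalW builds multiple alignments progressively and uses more elaborate gap penalties). The scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions chosen only for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return (aligned_a, aligned_b, score).

    Illustrative scoring only: the default values are assumptions,
    not the parameters used by any particular alignment program.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


if __name__ == "__main__":
    top, bottom, total = global_align("ACGT", "AGT")
    print(top)
    print(bottom)
    print("score:", total)
```

Each "-" in the output marks a position where one sequence is assumed to have gained or lost a residue relative to the other, which is the insertion or deletion interpretation described above.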
12.6.7 Constructing Phylogenies
What is a tree? It is a diagram that illustrates relationships among organisms, and
trees can be portrayed in several ways (Page 2011). The number of species in the
tree may range from tens to tens of thousands. The tree should provide a pattern
of ancestry, divergence and descent, using branches that merge at points that
represent common ancestors, each of which is connected through more-distant
ancestors. The more ancestors that two species share, the more closely related
they are. If two species share a common ancestor but that ancestor is not shared
by any other taxa on the tree, these species are known as “sister taxa.” If a
species is not linked to any of the other species (other than by a distant ancestor),
it is considered an “outgroup” (Gregory 2008).
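The tree vocabulary above (shared ancestors, sister taxa, outgroup) can be sketched with a very small data structure. The example below stores a toy tree as a child-to-parent map; the four taxa, the internal node labels (n1, n2, root), and the function names are hypothetical and serve only as an illustration.

```python
# Toy tree as a child -> parent map (labels are illustrative assumptions).
PARENT = {
    "human": "n1", "chimp": "n1",
    "n1": "n2", "gorilla": "n2",
    "n2": "root", "macaque": "root",
}


def lineage(node):
    """Return the node followed by its ancestors, ending at the root."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def mrca(a, b):
    """Return the most recent common ancestor of two nodes."""
    on_path_of_a = set(lineage(a))
    for node in lineage(b):
        if node in on_path_of_a:
            return node
    raise ValueError("nodes do not belong to the same tree")


def leaves_under(node):
    """Return the set of tip taxa that descend from a node."""
    children = [child for child, parent in PARENT.items() if parent == node]
    if not children:
        return {node}
    tips = set()
    for child in children:
        tips |= leaves_under(child)
    return tips


def are_sister_taxa(a, b):
    """True if the common ancestor of a and b has no other descendants."""
    return a != b and leaves_under(mrca(a, b)) == {a, b}


if __name__ == "__main__":
    print(mrca("human", "chimp"))
    print(are_sister_taxa("human", "chimp"))
    print(are_sister_taxa("human", "gorilla"))
    print(mrca("human", "macaque"))
```

In this toy tree, "human" and "chimp" share three ancestors (n1, n2, root), whereas "human" and "gorilla" share only two, and "macaque" joins the others only at the root, so it plays the role of the outgroup.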
The primary methods of phylogeny construction are parsimony, distance,
likelihood, and Bayesian, with variants within each of these broad categories.
The goal of all methods is to identify the relationships (topology of a tree) that