Biology Reference
In-Depth Information
methods to do so. A description of some of the problems encountered
follows.
As we have seen, there is always the possibility that in the course
of evolution, nucleotides have been inserted into a gene or
deleted from it (similarly, that amino acids have been inserted or
deleted in a peptide sequence). Since it is often impossible to say
beforehand which of the two events took place, the term “indel”
(insertion/deletion) is commonly used. Phylogenetically, a column
where there is a nucleotide or an amino acid in one sequence and
an indel in another is analogous to a morphological character
which is present in one species but not in another. As we have
seen in Fig. 2, an indel in a multiple alignment is represented by
a gap (“-”) in some of the sequences and by a nucleotide (or an
amino acid) in others. As a further complication, an insertion or
deletion event is not necessarily limited to a single site: several
nucleotides or amino acids may be inserted or deleted at once.
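
To make the gap notation concrete, here is a minimal Python sketch that
finds each maximal run of gap characters in an aligned sequence. A run of
several adjacent gaps may correspond to a single insertion or deletion
event, which is why the run, not the individual gap, is reported. The
function name gap_runs and the three toy sequences are hypothetical and
chosen only for illustration; they are not taken from Fig. 2.

```python
import re


def gap_runs(aligned_seq, gap_char="-"):
    """Return (start, length) for each maximal run of gap characters.

    Positions are 0-based column indices in the aligned sequence.
    """
    pattern = re.escape(gap_char) + "+"
    return [(m.start(), m.end() - m.start())
            for m in re.finditer(pattern, aligned_seq)]


# Toy alignment (made-up data): every row has the same number of columns.
alignment = {
    "species_A": "ATGCCGTA",
    "species_B": "ATG---TA",  # one indel spanning three sites
    "species_C": "AT-CCGTA",  # one single-site indel
}

for name, seq in alignment.items():
    print(name, gap_runs(seq))
```

For species_B the sketch reports one run of length three starting at
column 3, rather than three separate gaps.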
Of course, a multiple alignment is not meant to produce indels
ad libitum, as this would be contrary to biological common
sense and to the principle of parsimony. Accordingly, both
analysts and alignment algorithms assign a “penalty” to gaps.
Once a scoring system for mismatched nucleotides and a penalty
for indels have been defined, finding the optimal alignment is
an apparently straightforward mathematical problem. Solving it
for two sequences is fairly simple and can be done at tremendous
speeds by today's computers. However, the computation time and
memory needed to find the exact optimum grow exponentially with
the number of sequences, thus posing a major challenge to
computers. Several shortcuts or heuristics (from the Greek
heuriskein, “to find”) have been devised to speed up the
process, but they all trade alignment quality for speed: the
result is no longer guaranteed to be optimal.
Consequently, many scientists in phylogenetics and in other
domains of biology choose to manually improve computer-
generated alignments, drawing on their biochemical and taxonomic
knowledge base.
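
The two-sequence case can be sketched with standard dynamic programming
for global alignment. The text does not name a specific algorithm, so the
following is only one common way to do it, under stated assumptions: a
linear gap penalty (each gapped site costs the same) and arbitrary
illustrative scores of +1 for a match, -1 for a mismatch and -2 per gap.
The function name global_alignment_score is hypothetical.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of sequences a and b.

    Uses a linear gap penalty; the default scores are illustrative
    assumptions, not values from the text.
    """
    # prev[j] is the best score for aligning the prefix of a processed so
    # far with b[:j]; before any row is processed, b[:j] faces only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # a[:i] aligned against an empty prefix of b
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + pair,  # align a[i-1] with b[j-1]
                prev[j] + gap,       # a[i-1] faces a gap in b
                curr[j - 1] + gap,   # b[j-1] faces a gap in a
            ))
        prev = curr
    return prev[-1]


# Five matches and three gapped sites: 5 * 1 + 3 * (-2) = -1
print(global_alignment_score("ATGCCGTA", "ATGTA"))  # -1
```

The table filled here has one cell per pair of prefixes, so the work is
proportional to the product of the two sequence lengths. Extending the
same exact approach to N sequences requires an N-dimensional table, which
is where the exponential growth in time and memory comes from.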