Biomedical Engineering Reference
G) ATTCG G CATT CAG AG C G AG A
H) ATTCG A CATT GCT AG T G GT A
Unlike the previous cases, there are no relatively long runs of character pairings, and the matching
pairs are separated by unaligned characters. The alignment score is 1 point per aligned pair, or 13.
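The count of 13 aligned pairs can be checked with a short sketch. The function name count_matches is hypothetical, and the two strings are sequences (G) and (H) from above with the display spacing removed.

```python
def count_matches(seq_a, seq_b):
    """Count positions where two ungapped sequences hold the same character."""
    return sum(a == b for a, b in zip(seq_a, seq_b))


# Sequences (G) and (H), written without the visual spacing.
seq_g = "ATTCGGCATTCAGAGCGAGA"
seq_h = "ATTCGACATTGCTAGTGGTA"

# At 1 point per aligned pair, the match count is also the alignment score.
print(count_matches(seq_g, seq_h))  # 13
```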
One attempt at visual alignment by adding four gaps into sequence (H) results in:
G) ATTCG G CATT CAGA GCTAG A
I) ATTCG A CATT ---- GCTAG TGGTA
This alignment results in a score of 12: 14 aligned pairs minus 2 points for the 4 gaps (at -0.5
point per gap) introduced into sequence (H), transforming it into sequence (I). Suppose that, in
addition, each inexact match is penalized at -0.5 point per character pair. In the case of sequences
(G) and (I), there are 6 inexact matches, for a penalty of 6 x -0.5 = -3. Using this new
alignment-scoring algorithm, and ignoring the length difference between the two sequences, the
alignment score for the (G)-(I) alignment becomes:
Alignment Score = 14 alignments + 4 gaps + 6 inexact matches
= 14 + (4 x -0.5) + (6 x -0.5)
= 14 - 2 - 3
= 9
In this example, adding gaps results in a lower alignment score (9, compared with 13 for the
ungapped alignment), illustrating how the relative worth of exact matches, inexact matches, and gaps
determines the eventual alignment of two sequences. For example, if gaps are penalized heavily and
inexact matches are penalized only lightly, then the highest-scoring alignments will have few gaps.
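The scoring arithmetic for the (G)-(I) alignment can be sketched in a few lines of Python. The function names count_columns and alignment_score are hypothetical, and sequence (G) is taken as printed in the aligned display. Counting trailing characters that have no partner as inexact matches is an assumption, chosen here because it reproduces the counts of 14, 4, and 6 used in the text.

```python
from itertools import zip_longest


def count_columns(seq_a, seq_b, gap_char="-"):
    """Return (exact matches, gap characters, inexact matches), column by column."""
    matches = gaps = inexact = 0
    for a, b in zip_longest(seq_a, seq_b):
        if a == gap_char or b == gap_char:
            gaps += 1
        elif a == b:
            matches += 1
        else:
            # A mismatched pair, or (by assumption) a trailing character
            # with no partner in the shorter sequence.
            inexact += 1
    return matches, gaps, inexact


def alignment_score(matches, gaps, inexact,
                    match_pts=1.0, gap_pts=-0.5, inexact_pts=-0.5):
    """Weighted sum of the three counts; the default weights follow the text."""
    return matches * match_pts + gaps * gap_pts + inexact * inexact_pts


seq_g = "ATTCGGCATTCAGAGCTAGA"
seq_i = "ATTCGACATT----GCTAGTGGTA"

counts = count_columns(seq_g, seq_i)
print(counts)                                     # (14, 4, 6)
print(alignment_score(*counts, inexact_pts=0.0))  # 12.0 (gaps penalized only)
print(alignment_score(*counts))                   # 9.0
```

Changing gap_pts and inexact_pts shows how the relative weights shift the score of the same alignment.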
Although a simple gap penalty of -0.5 point per gap has been used to illustrate the effect of
scoring on sequence alignment, the gap penalty is typically calculated as:
Penalty_gap = Cost_opening + Cost_extension x Length_gap
In this formula, Penalty_gap is the total gap penalty, Cost_opening is the cost of opening a gap in a
sequence, Cost_extension is the cost of extending an existing gap by one character, and Length_gap is
the length of the gap in characters. The minimum value of Length_gap is one. Returning to sequence
pair (E)-(F), assuming that Cost_opening is -0.5 and Cost_extension is -0.5, the gap penalty becomes:
Penalty_gap = Cost_opening + Cost_extension x Length_gap
= -0.5 + (-0.5 x 4)
= -2.5
With the expanded method of computing gap penalty, the score becomes 10 + 6 - 2.5 = 13.5 points.
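The gap penalty formula above can be written as a small function. This is a minimal sketch: the name gap_penalty and its parameter names are hypothetical, and the default costs are the values assumed for the (E)-(F) example.

```python
def gap_penalty(length, cost_opening=-0.5, cost_extension=-0.5):
    """Opening cost plus extension cost times gap length."""
    if length < 1:
        # The text states that the minimum gap length is one.
        raise ValueError("gap length must be at least 1")
    return cost_opening + cost_extension * length


penalty = gap_penalty(4)
print(penalty)           # -2.5
print(10 + 6 + penalty)  # 13.5, the (E)-(F) score quoted above
```

Because the opening cost is charged once per gap, one gap of length 4 costs -2.5, whereas four separate gaps of length 1 would cost 4 x -1.0 = -4.0 under the same costs.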
The gap penalty formula can be extended to include a penalty for the gaps added at the end of a
sequence to make the two sequences equal in length. However, if the sequences are of very different
lengths, then it probably doesn't make sense to penalize these end gaps.