Dynamic Programming - Multiple Sequence Alignment Methods

Biology Reference

In-Depth Information

e p; 0 ð

0). Because the paths

following these paths are not punished, the alignments staring or

ending with consecutive gaps are more likely to be found, which is

the desired solution in semi-global alignment problems.

An application of the pairwise alignment algorithm is proposed

by Waterman and Smith [ 6 ] and it is known as the Smith-

Waterman algorithm. The main objective is finding the similar

regions of two sequences instead of aligning them globally. In this

sense, Smith-Waterman algorithm searches for local alignments.

Local alignments are crucial in finding the homologous regions

between proteins and in classifying newly discovered biological

elements. Clearly what we look for in the alignment graph is a

high-scoring path section among all available paths. An algorithm

searching for the highest scoring local alignment should search for

the maximal path that the score difference of the beginning and the

end of the segment is the greatest among all segments (i.e., highest

scoring subpath). Another requirement for such maximal paths is

that a cell visited by the path cannot have a score smaller than the

score of the starting cell. This situation would be contradict with

the definition of a high-scoring alignment, because the section of

the path between the initial cell and the lower scoring cell contri-

butes a negative score. This motivation alters the scoring scheme to

a direction that negatively contributing trends are avoided and they

should not be included in an optimal path.

To satisfy this property, the optimal alignment scores are com-

puted using the same recursive procedure of the Needleman--

Wunsch algorithm is applied with a modification. During the

recursive computation, when the score of a path becomes negative,

the cell associated is set to zero score, to exclude it from optimal

path candidates. In this case, starting from zero-scored cells, an

optimal alignment will attain a high score and the path section

between the maximum score and the zero-scored cell will be the

best-conserved subsequences of

;

Þ¼

e jXj;p ðj

Þ¼

e p;jY j ð

; j

jÞ ¼

the pair of

input sequences.

The scoring function is simply modified to

;

Þþ

e i;j

;

Þ¼

max

(8)

max p S

;

Þþ

e i;j

;

max q S

;

Þþ

e i;j ð

;

and the forward-scan phase is the same as the Needleman-Wunsch

algorithm. In the reverse-scan phase we look for the best scoring

path section. This is found by locating the highest scoring cell and

tracing back to a zero-scoring cell. In Fig. 6 , we can see the result-

ing local alignment when the modified algorithm is run on the same

example used in the Needleman-Wunsch algorithm.

Multiple Sequence Alignment Methods

Search WWH ::

Custom Search

Home