Dynamic Programming - Multiple Sequence Alignment Methods

Biology Reference

In-Depth Information

referred as to be aligned with a gap. A popular visual representation

of sequence alignments is to print the sequence couples as two lines,

in which the matching residues are printed in the same column. The

gaps, corresponding to the unmatched residues, are usually repre-

sented with dashes “-.” Since the matchings are ordered, the

primary structure of the sequences, which is the permutation of

the letters, is preserved. In other words, it is possible to edit

sequence X by substituting, inserting, and deleting residues and

converting it to Y . An alignment offers a map indicating which

residues are conserved, substituted, deleted, and inserted. As an

example, assume two DNA fragments X

GACAT and Y

GAGACAT . An alignment A of X and Y is

GAGACAT

GACA- -T.

Here, according to the alignment the first two, the fourth, and the

last residues of Y are conserved, the third one is substituted to a C ,

and fifth and the sixth ones are deleted.

This definition of sequence alignment does not impose any

restriction on the matching positions. The two extreme cases are

the alignment that all the residues of X are aligned to gaps placed

before the first residue of Y , and the one that all X residues are

matched with gaps after the last residue of Y . The intermediate

alignments, ranging between these extremes are all valid align-

ments. To represent the alignments graphically, a rectangular grid

shown in Fig. 1 has been used traditionally as it provides a simple

understanding of the alignment space, as well as the dynamic

programming solutions proposed. The rows of the grid correspond

to the residues of a sequence, where the columns represent the

residues of the second sequence to be aligned. In this structure, a

cell or vertex c ( i , j ), (0

) is connected to its

surrounding neighbors,

;

Þ; ð

;

Þ; ð

;

, with

incoming

directed

edges,

and

;

Þ; ð

;

Þ;

, with outgoing directed edges. We can show that each

alignment of X and Y corresponds to a path from the upper-left

corner to the lower-right corner of this graph by the following

assignments. A diagonal edge from the vertex c

;

c ( i , j ) corresponds to the aligned pair X i , Y j , a horizontal edge

from c ( i , j

;

1) to c ( i , j ) corresponds to the pairing of Y j with a

gap, and a vertical edge from c ( i

1, j )to c ( i , j ) corresponds to

the pairing of X i with a gap. We can represent any alignment A by

reading the pairings of an alignment from left to right in order, and

converting it to an edge sequence E . The adjacent pairs of A must

involve at least one incrementation of i and j since gap-to-gap

pairings are not valid. Therefore, the adjacent edges in E are also

adjacent in the graph, i.e., two adjacent edges in E are incoming and

outgoing edges of the same vertex in the grid. This consecutive

placement implies that E is a path from the upper-left to the

Multiple Sequence Alignment Methods

Search WWH ::

Custom Search

Home