Biology Reference
In-Depth Information
referred as to be aligned with a gap. A popular visual representation
of sequence alignments is to print the sequence couples as two lines,
in which the matching residues are printed in the same column. The
gaps, corresponding to the unmatched residues, are usually repre-
sented with dashes “-.” Since the matchings are ordered, the
primary structure of the sequences, which is the permutation of
the letters, is preserved. In other words, it is possible to edit
sequence X by substituting, inserting, and deleting residues and
converting it to Y . An alignment offers a map indicating which
residues are conserved, substituted, deleted, and inserted. As an
example, assume two DNA fragments X
¼
GACAT and Y
¼
GAGACAT . An alignment A of X and Y is
GAGACAT
GACA- -T.
Here, according to the alignment the first two, the fourth, and the
last residues of Y are conserved, the third one is substituted to a C ,
and fifth and the sixth ones are deleted.
This definition of sequence alignment does not impose any
restriction on the matching positions. The two extreme cases are
the alignment that all the residues of X are aligned to gaps placed
before the first residue of Y , and the one that all X residues are
matched with gaps after the last residue of Y . The intermediate
alignments, ranging between these extremes are all valid align-
ments. To represent the alignments graphically, a rectangular grid
shown in Fig. 1 has been used traditionally as it provides a simple
understanding of the alignment space, as well as the dynamic
programming solutions proposed. The rows of the grid correspond
to the residues of a sequence, where the columns represent the
residues of the second sequence to be aligned. In this structure, a
cell or vertex c ( i , j ), (0
i
j
X
j
,0
j
j
Y
j
) is connected to its
surrounding neighbors,
ð
i
1
;
j
Þ; ð
i
1
;
j
1
Þ; ð
i
;
j
1
Þ
, with
incoming
directed
edges,
and
to
ð
i
þ
1
;
j
Þ; ð
i
þ
1
;
j
þ
1
Þ;
ð
, with outgoing directed edges. We can show that each
alignment of X and Y corresponds to a path from the upper-left
corner to the lower-right corner of this graph by the following
assignments. A diagonal edge from the vertex c
i
;
j
þ
1
Þ
to
c ( i , j ) corresponds to the aligned pair X i , Y j , a horizontal edge
from c ( i , j
ð
i
1
;
j
1
Þ
1) to c ( i , j ) corresponds to the pairing of Y j with a
gap, and a vertical edge from c ( i
1, j )to c ( i , j ) corresponds to
the pairing of X i with a gap. We can represent any alignment A by
reading the pairings of an alignment from left to right in order, and
converting it to an edge sequence E . The adjacent pairs of A must
involve at least one incrementation of i and j since gap-to-gap
pairings are not valid. Therefore, the adjacent edges in E are also
adjacent in the graph, i.e., two adjacent edges in E are incoming and
outgoing edges of the same vertex in the grid. This consecutive
placement implies that E is a path from the upper-left to the
Search WWH ::




Custom Search