Biology Reference
In-Depth Information
that the scores can be computed in sequence-length-size memory,
Hirschberg's algorithm can be run in linear space.
At the implementation level, different versions of the linear-
space algorithm are available [ 11 , 12 ]. As a variation, the median
edge can be found by only using a forward scan of score computa-
tion [ 13 ]. In this version, the forward score computation continues
when the median line is reached. For each cell in the
bj
X
j=
2
1th
line, S F
are computed and the median edges result-
ing in each one of these scores are recorded. Clearly an additional
sequence-length memory space is needed for this ancestral records.
At iteration k , for each cell in the
ðbj
X
j=
2
1
;
j
Þ
th line, the cell
visited by the optimal path in the previous line is determined by
the score computation. Therefore, the median edge for the current
cell can be updated at each iteration. At the final iteration, the
median edge assigned to the lower-right cell is determined to be
the median edge belonging to the solution path.
The linear-space algorithm using median edge candidate
records provides a space-time trade-off flexibility. The algorithm
can be generalized to m splits of subproblems instead of binary
splits at each recursion. Clearly, keeping the records of the edges
crossing the
ðbj
X
j=
2
k
Þ
th line, we can detect the edge belonging to
the optimal path at that portion. In a forward score computation
pass, we can keep records of such edges for every
bj
X
j
/ m
c
bj
X
j
/ m
c
lines
(2
). This would require m sequence size memory
spaces. At each iteration m mid-edges are discovered, and m sub-
problems are defined. For iteration k , m k subproblems are defined
and each problem requires ( m k ) 2 fold less computation than the
original problem. In this case, the amount of computation is
S
m
j
X
j
¼ P 0 S =
3 k
. This space-time trade-
off might be important in practical applications.
As we have seen the
¼ S þ S
k
1
Þ
Hirschberg
n 2 ) time and
n ) space algorithms are
available by affine gap penalties and Hirschberg's algorithm for
global alignments, the semi-global alignment extension is straight-
forward. Initializing the first row and columns of S , V , and H
matrices and defining g ( p )
0 for the last rows and columns is
sufficient. On the other hand, additional requirements are needed
to perform local alignments in linear time [ 14 , 15 ]. For calculating
the scores, the same recursive function equation 12 with the
Smith-Waterman modification is used. Here the scoring modifica-
tion is setting the negative scores to zero. The linear-space scoring
memory is the same as the global alignment case, and the main
difference is a root cell is recorded for computed cell c ( i , j ). Recall
that the Smith-Waterman algorithm terminates a local alignment
when a zero-scored ending cell is met during the backward scan
phase. The root of c ( i , j ) is simply the zero-scored starting point of
the optimal path passing through c ( i , j ). The root is empty when
S ( i , j )
¼
0 and it is initialized when a cell attains a positive score
with an edge coming from a zero-scored parent. This root is
¼
Search WWH ::




Custom Search