Once the partition function is constructed, the posterior probability of $x_i$ aligned to $y_j$ can be computed as
$$P(x_i \sim y_j) = \frac{Z_{i-1,j-1}\, Z'^{M}_{i+1,j+1}}{Z}\; e^{s(x_i, y_j)/T}, \tag{9}$$
where $Z'^{M}_{i,j}$ is the partition function of alignments of the subsequences $x_{i \ldots m}$ and $y_{j \ldots n}$ that begin with $x_i$ paired with $y_j$, and $m$ and $n$ are the lengths of $x$ and $y$, respectively. It can be computed using standard backward recursion formulas [3]. In the above equation, $Z_{i-1,j-1}/Z$ and $Z'^{M}_{i+1,j+1}/Z$ represent the probabilities of feasible suboptimal alignments (as determined by the $T$ parameter) of $x_{1 \ldots i-1}$ and $y_{1 \ldots j-1}$, and of $x_{i+1 \ldots m}$ and $y_{j+1 \ldots n}$, respectively. Thus, the equation weighs alignments according to their partition function probabilities and estimates $P(x_i \sim y_j)$ as the sum of probabilities of all alignments where $x_i$ is paired with $y_j$.
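As an illustration of how a forward and a backward partition function combine into a posterior, the following is a minimal Python sketch under simplifying assumptions that are not taken from the text: a linear gap penalty `gap` (a single matrix instead of separate match and gap matrices), a caller-supplied scoring function `score`, and a backward matrix `Zb` that holds the full partition function of each suffix pair rather than the match-anchored $Z'^{M}$. The function name `posterior_matrix` and its parameters are hypothetical, and no rescaling is done, so long sequences may overflow or underflow.

```python
import math

def posterior_matrix(x, y, score, gap=-1.0, T=1.0):
    """Posterior pairing probabilities under a simplified linear-gap model.

    Assumptions (illustrative only): `score(a, b)` returns a substitution
    score, `gap` is a per-position gap score, and T is the temperature.
    Returns (P, Z) where P[i][j] is the posterior of x[i] pairing with y[j].
    """
    m, n = len(x), len(y)
    wg = math.exp(gap / T)  # Boltzmann weight of one gap position

    # Forward: Zf[i][j] sums the weights of all alignments of x[:i] and y[:j].
    Zf = [[0.0] * (n + 1) for _ in range(m + 1)]
    Zf[0][0] = 1.0
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0 and j == 0:
                continue
            total = 0.0
            if i > 0 and j > 0:
                total += Zf[i - 1][j - 1] * math.exp(score(x[i - 1], y[j - 1]) / T)
            if i > 0:
                total += Zf[i - 1][j] * wg
            if j > 0:
                total += Zf[i][j - 1] * wg
            Zf[i][j] = total

    # Backward: Zb[i][j] sums the weights of all alignments of x[i:] and y[j:].
    Zb = [[0.0] * (n + 1) for _ in range(m + 1)]
    Zb[m][n] = 1.0
    for i in range(m, -1, -1):
        for j in range(n, -1, -1):
            if i == m and j == n:
                continue
            total = 0.0
            if i < m and j < n:
                total += Zb[i + 1][j + 1] * math.exp(score(x[i], y[j]) / T)
            if i < m:
                total += Zb[i + 1][j] * wg
            if j < n:
                total += Zb[i][j + 1] * wg
            Zb[i][j] = total

    Z = Zf[m][n]  # total partition function

    # Prefix weight * weight of the pair itself * suffix weight, normalized by Z.
    P = [[Zf[i][j] * math.exp(score(x[i], y[j]) / T) * Zb[i + 1][j + 1] / Z
          for j in range(n)] for i in range(m)]
    return P, Z
```

For example, `posterior_matrix("ACG", "AG", lambda a, b: 1.0 if a == b else -1.0)` returns a 3 x 2 matrix in which each row and each column sums to at most 1, because a residue is paired with at most one residue in any single alignment.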
2.4 Maximal Expected Accuracy Alignment
Given the posterior probability matrix $P(x_i \sim y_j)$, we define the expected accuracy of an alignment $a$ of $x$ and $y$ as
$$\frac{1}{\min\{|x|, |y|\}} \sum_{x_i \sim y_j \in a} P(x_i \sim y_j \in a \mid x, y). \tag{10}$$
The maximum expected accuracy alignment score is computed
by dynamic programming using the following recurrence described
in Durbin [ 3 ].
for $i = 1$ to $|x|$
for $j = 1$ to $|y|$
$$A(i, j) = \max \left\{ \begin{array}{l} A(i-1, j-1) + P(x_i \sim y_j) \\ A(i-1, j) \\ A(i, j-1) \end{array} \right. \tag{11}$$
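The recurrence in Eq. (11) can be sketched in Python as follows. This is a minimal illustration, not the Probcons or Probalign implementation: the function name `mea_alignment` is hypothetical, the posterior matrix `P` is assumed to be a list of rows with `P[i][j]` holding $P(x_{i+1} \sim y_{j+1})$ (0-based indices), the first row and column of `A` are taken to be 0, and ties are broken in favor of the diagonal move.

```python
def mea_alignment(P):
    """Maximum expected accuracy alignment from a posterior matrix P.

    P[i][j] is assumed to be the posterior probability that residue i of x
    pairs with residue j of y (0-based). Returns (score, pairs), where
    score is A(|x|, |y|) and pairs lists the aligned (i, j) index pairs.
    """
    m = len(P)
    n = len(P[0]) if m else 0

    # A has one extra row and column, both fixed at 0.
    A = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            A[i][j] = max(
                A[i - 1][j - 1] + P[i - 1][j - 1],  # pair x_i with y_j
                A[i - 1][j],                        # leave x_i unpaired
                A[i][j - 1],                        # leave y_j unpaired
            )

    # Traceback: re-derive which of the three cells produced each maximum.
    pairs = []
    i, j = m, n
    while i > 0 and j > 0:
        if A[i][j] == A[i - 1][j - 1] + P[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif A[i][j] == A[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return A[m][n], pairs
```

Dividing the returned score by `min(len(P), len(P[0]))` gives the normalized expected accuracy of Eq. (10) for the recovered alignment.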
The first row and column of $A$ are set to 0. The alignment score is given by $A(|x|, |y|)$, where $|x|$ and $|y|$ denote the lengths of sequences $x$ and $y$. The actual alignment of $x$ and $y$ can be recovered by keeping track of which cell the maximum value is obtained from for each entry of $A$ [3].
Both Probcons and Probalign first estimate posterior probabilities for amino acid residues for all pairs of protein sequences in the input. Probcons introduced a number of new approaches for constructing a multiple alignment with posterior probabilities for all pairs of sequences. It first performs a probabilistic consistency transformation to improve posterior probabilities with the aid of a third sequence [12]. It then adapts three standard approaches in multiple sequence alignment, namely construction of a guide-tree,