\[
\sum_{i=1}^{20} a_i \, P(i \to j) = \sum_{i=1}^{20} a_i \, \frac{p_{rel}(i, j)}{p_i} \tag{1.4}
\]
This leads to the following scoring function.
\[
\mathrm{score}(a, j) = \log \frac{\sum_{i=1}^{20} a_i \, P(i \to j)}{p_j} = \log \sum_{i=1}^{20} a_i \, \frac{p_{rel}(i, j)}{p_i \, p_j} \tag{1.5}
\]
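As a minimal sketch of Eq. (1.5), the Python function below computes the score of aligning one profile column with one residue. The names `seq_profile_score`, `p_rel`, and `p`, and the two-letter toy alphabet with its numbers, are assumptions made for illustration only; a real application would use the 20 amino acids with probabilities estimated from data.

```python
import math

def seq_profile_score(a, j, p_rel, p):
    """Eq. (1.5): log of sum_i a[i] * p_rel[i][j] / (p[i] * p[j]).

    a      -- probability distribution of the profile column
    j      -- index of the residue in the sequence
    p_rel  -- p_rel[i][j], probability of i and j being aligned (assumed given)
    p      -- background probabilities p[i] (assumed given)
    """
    total = sum(a[i] * p_rel[i][j] / (p[i] * p[j]) for i in range(len(a)))
    return math.log(total)

# Hypothetical two-letter alphabet; these numbers are not from the text.
p_rel = [[0.4, 0.1],
         [0.1, 0.4]]
p = [0.5, 0.5]
a = [0.9, 0.1]          # the column strongly prefers letter 0

s0 = seq_profile_score(a, 0, p_rel, p)   # residue matches the preferred letter
s1 = seq_profile_score(a, 1, p_rel, p)   # residue does not match
```

With these toy numbers the score is positive for the preferred letter and negative for the other one. When `a` puts all its mass on a single letter i, the function returns log(p_rel(i, j) / (p_i p_j)).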
Again, summing up Eq. (1.5) over all aligned positions yields a score for the
whole alignment. Similar to sequence alignment, dynamic programming can be
used to generate an optimal alignment between one sequence and one profile for a
scoring function defined in Eq. (1.3) or Eq. (1.5).
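The dynamic programming step can be sketched as follows. This is one possible formulation, not necessarily the one used by any particular method: it assumes a global alignment and a linear gap penalty, neither of which the text specifies. The function name `align_sequence_to_profile`, the `gap` parameter, and the example score matrix are hypothetical.

```python
def align_sequence_to_profile(score, gap=-1.0):
    """Global sequence-profile alignment by dynamic programming.

    score[k][t] is the score of aligning profile column k with sequence
    position t, for example computed with Eq. (1.5). `gap` is an assumed
    linear gap penalty. Returns (total score, list of aligned (k, t) pairs).
    """
    n = len(score)
    m = len(score[0]) if n else 0
    # best[k][t]: best score for the first k columns and the first t residues.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for k in range(1, n + 1):
        best[k][0] = k * gap
    for t in range(1, m + 1):
        best[0][t] = t * gap
    for k in range(1, n + 1):
        for t in range(1, m + 1):
            best[k][t] = max(
                best[k - 1][t - 1] + score[k - 1][t - 1],  # align column and residue
                best[k - 1][t] + gap,                      # column against a gap
                best[k][t - 1] + gap,                      # residue against a gap
            )
    # Trace back from the bottom-right corner to recover the aligned pairs.
    pairs = []
    k, t = n, m
    while k > 0 and t > 0:
        if best[k][t] == best[k - 1][t - 1] + score[k - 1][t - 1]:
            pairs.append((k - 1, t - 1))
            k, t = k - 1, t - 1
        elif best[k][t] == best[k - 1][t] + gap:
            k -= 1
        else:
            t -= 1
    pairs.reverse()
    return best[n][m], pairs

# Hypothetical scores for a profile of 2 columns and a sequence of 3 residues.
example_scores = [[2.0, -1.0, -1.0],
                  [-1.0, -1.0, 2.0]]
total, pairs = align_sequence_to_profile(example_scores, gap=-1.0)
```

In this toy case the best alignment pairs column 0 with residue 0 and column 1 with residue 2, leaving residue 1 against a gap. The total is the sum of the two aligned-position scores plus one gap penalty.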
1.4.5 Scoring Function for Profile-Profile Alignment and Comparison
The sequence-profile scoring function defined in Eq. (1.5) can be extended to score
a profile-profile alignment. Let X and Y be two aligned profile columns with amino
acid probability distributions a_i (i = 1, 2, ..., 20) and b_j (j = 1, 2, ..., 20),
respectively. The following log average score, which is a generalization of Eq. (1.5),
can be used to estimate the similarity between these two profile columns.
\[
\mathrm{LogAverageScore}(a, b) = \log \sum_{i=1}^{20} \sum_{j=1}^{20} a_i \, b_j \, \frac{p_{rel}(i, j)}{p_i \, p_j} \tag{1.6}
\]
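A minimal Python sketch of Eq. (1.6) is given here to make the double sum concrete. The function name `log_average_score` and the two-letter toy alphabet with its numbers are assumptions for illustration; they are not taken from the text.

```python
import math

def log_average_score(a, b, p_rel, p):
    """Eq. (1.6): log of sum_i sum_j a[i] * b[j] * p_rel[i][j] / (p[i] * p[j])."""
    total = sum(
        a[i] * b[j] * p_rel[i][j] / (p[i] * p[j])
        for i in range(len(a))
        for j in range(len(b))
    )
    return math.log(total)

# Hypothetical two-letter alphabet; these numbers are not from the text.
p_rel = [[0.4, 0.1],
         [0.1, 0.4]]
p = [0.5, 0.5]
a = [0.9, 0.1]
b = [0.8, 0.2]

las = log_average_score(a, b, p_rel, p)
```

If `b` puts all its mass on a single letter j, the double sum collapses to the single sum of Eq. (1.5), which is the sense in which Eq. (1.6) generalizes it.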
Some methods also use the following average mutation score to measure the
similarity of these two profile columns.
\[
\mathrm{AverageScore}(a, b) = \sum_{i=1}^{20} \sum_{j=1}^{20} a_i \, b_j \, \log \frac{p_{rel}(i, j)}{p_i \, p_j} \tag{1.7}
\]
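Eq. (1.7) differs from Eq. (1.6) only in where the logarithm sits: it is applied to each pair (i, j) before averaging rather than to the average. The sketch below illustrates this under the same assumptions as before, with a hypothetical function name `average_score` and made-up two-letter probabilities. It also assumes every p_rel(i, j) is strictly positive so that the logarithm is defined.

```python
import math

def average_score(a, b, p_rel, p):
    """Eq. (1.7): sum_i sum_j a[i] * b[j] * log(p_rel[i][j] / (p[i] * p[j]))."""
    return sum(
        a[i] * b[j] * math.log(p_rel[i][j] / (p[i] * p[j]))
        for i in range(len(a))
        for j in range(len(b))
    )

# Hypothetical two-letter alphabet; these numbers are not from the text.
p_rel = [[0.4, 0.1],
         [0.1, 0.4]]
p = [0.5, 0.5]
a = [0.9, 0.1]
b = [0.8, 0.2]

avg = average_score(a, b, p_rel, p)
```

Because the logarithm is concave and the weights a_i b_j sum to one, Jensen's inequality implies that the average score never exceeds the log average score of the same two columns.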
Besides the scoring functions defined in Eqs. (1.6) and (1.7), the following dot
product and Jensen-Shannon scores have also been proposed in the literature [56].
Dot product
Calculating the dot product is the simplest and fastest approach to
compare two profile columns [57]. This method calculates the similarity of two
aligned profile columns as follows.