\[
P(a \to j) = \sum_{i=1}^{20} a_i \, P(i \to j) = \sum_{i=1}^{20} a_i \, \frac{p_{rel}(i, j)}{p_i} \tag{1.4}
\]
This leads to the following scoring function.
\[
\mathrm{score}(a, j) = \log \frac{\sum_{i=1}^{20} a_i \, P(i \to j)}{p_j} = \log \sum_{i=1}^{20} a_i \, \frac{p_{rel}(i, j)}{p_i \, p_j} \tag{1.5}
\]
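As a rough illustration of Eq. (1.5), the following Python sketch scores one profile column against each residue type. The 3-letter alphabet, the joint probability table `P_REL`, and the function names are hypothetical toy choices made for readability; a real application would use the 20 amino acids and an empirically derived table. The background frequencies are assumed here to be the marginals of the joint table.

```python
import math

# Hypothetical joint probabilities p_rel(i, j) for a toy 3-letter alphabet.
# The table is symmetric and its entries sum to 1.
P_REL = [
    [0.30, 0.05, 0.05],
    [0.05, 0.20, 0.05],
    [0.05, 0.05, 0.20],
]

def background(p_rel):
    """Background frequencies p_i, assumed to be the row marginals of p_rel."""
    return [sum(row) for row in p_rel]

def seq_profile_score(a, j, p_rel, p):
    """Eq. (1.5): log of sum_i a_i * p_rel(i, j) / (p_i * p_j)."""
    total = sum(a_i * p_rel[i][j] / (p[i] * p[j]) for i, a_i in enumerate(a))
    return math.log(total)

P = background(P_REL)

# A profile column that strongly prefers residue 0.
column = [0.8, 0.1, 0.1]
scores = [seq_profile_score(column, j, P_REL, P) for j in range(3)]
```

With these toy numbers the column scores positively against residue 0 and negatively against the other two. A column equal to the background distribution scores zero against every residue, because the sum inside the logarithm then reduces to p_j / p_j.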
Again, summing up Eq. (1.5) over all aligned positions yields a score for the
whole alignment. Similar to sequence alignment, dynamic programming can be
used to generate an optimal alignment between one sequence and one profile for a
scoring function defined in Eq. (1.3) or Eq. (1.5).
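The dynamic programming step can be sketched as a Needleman-Wunsch style recursion in which the match term is the Eq. (1.5) score. This is only a minimal sketch under stated assumptions: the linear gap penalty `gap`, the toy 3-letter alphabet, and the table `P_REL` are hypothetical and are not taken from the text, which does not specify how gaps are penalized. The function returns the optimal global score only, without a traceback.

```python
import math

# Hypothetical joint probabilities for a toy 3-letter alphabet (not real data).
P_REL = [
    [0.30, 0.05, 0.05],
    [0.05, 0.20, 0.05],
    [0.05, 0.05, 0.20],
]
P = [sum(row) for row in P_REL]  # background frequencies as marginals

def column_score(a, j, p_rel, p):
    """Eq. (1.5) score of profile column a against residue j."""
    return math.log(sum(a[i] * p_rel[i][j] / (p[i] * p[j]) for i in range(len(a))))

def align_score(seq, profile, p_rel, p, gap=-1.0):
    """Optimal global alignment score of a sequence against a profile.

    seq is a list of residue indices, profile is a list of probability
    columns, and gap is an assumed linear gap penalty per gapped position.
    """
    n, m = len(seq), len(profile)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for r in range(1, n + 1):
        dp[r][0] = r * gap          # sequence prefix aligned to gaps only
    for c in range(1, m + 1):
        dp[0][c] = c * gap          # profile prefix aligned to gaps only
    for r in range(1, n + 1):
        for c in range(1, m + 1):
            match = dp[r - 1][c - 1] + column_score(profile[c - 1], seq[r - 1], p_rel, p)
            dp[r][c] = max(match, dp[r - 1][c] + gap, dp[r][c - 1] + gap)
    return dp[n][m]

# Two-column profile that prefers residue 0, then residue 1.
profile = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
good = align_score([0, 1], profile, P_REL, P)
bad = align_score([1, 0], profile, P_REL, P)
```

Here the sequence that follows the profile's preferences (`[0, 1]`) receives a higher score than the reversed one. The table costs O(nm) time and space for a sequence of length n and a profile of m columns.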
1.4.5 Scoring Function for Profile-Profile Alignment and Comparison
The sequence-profile scoring function defined in Eq. (1.5) can be extended to score
a profile-profile alignment. Let X and Y be two aligned profile columns with amino
acid probability distributions a_i (i = 1, 2, ..., 20) and b_j (j = 1, 2, ..., 20),
respectively. The following log average score, which is a generalization of
Eq. (1.5), can be used to estimate the similarity between these two profile columns.
\[
\mathrm{LogAverageScore}(a, b) = \log \sum_{i=1}^{20} \sum_{j=1}^{20} a_i \, b_j \, \frac{p_{rel}(i, j)}{p_i \, p_j} \tag{1.6}
\]
Some methods also use the following average mutation score to measure the
similarity of these two profile columns.
\[
\mathrm{AverageScore}(a, b) = \sum_{i=1}^{20} \sum_{j=1}^{20} a_i \, b_j \, \log \frac{p_{rel}(i, j)}{p_i \, p_j} \tag{1.7}
\]
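The difference between Eqs. (1.6) and (1.7) is only where the logarithm sits: the log average score takes the logarithm of the expected odds ratio, while the average score takes the expectation of the log-odds. The Python sketch below computes both for a pair of columns. The 3-letter alphabet, the table `P_REL`, and the function names are hypothetical toy choices, and background frequencies are assumed to be the marginals of the joint table.

```python
import math

# Hypothetical joint probabilities for a toy 3-letter alphabet (not real data).
P_REL = [
    [0.30, 0.05, 0.05],
    [0.05, 0.20, 0.05],
    [0.05, 0.05, 0.20],
]
P = [sum(row) for row in P_REL]  # background frequencies as marginals

def odds(i, j, p_rel, p):
    """Odds ratio p_rel(i, j) / (p_i * p_j)."""
    return p_rel[i][j] / (p[i] * p[j])

def log_average_score(a, b, p_rel, p):
    """Eq. (1.6): log of the odds ratio averaged over both columns."""
    n = len(a)
    return math.log(sum(a[i] * b[j] * odds(i, j, p_rel, p)
                        for i in range(n) for j in range(n)))

def average_score(a, b, p_rel, p):
    """Eq. (1.7): log-odds averaged over both columns."""
    n = len(a)
    return sum(a[i] * b[j] * math.log(odds(i, j, p_rel, p))
               for i in range(n) for j in range(n))

col_x = [0.8, 0.1, 0.1]
col_y = [0.7, 0.2, 0.1]
log_avg = log_average_score(col_x, col_y, P_REL, P)
avg = average_score(col_x, col_y, P_REL, P)
```

Because the weights a_i b_j sum to one and the logarithm is concave, Jensen's inequality gives AverageScore(a, b) <= LogAverageScore(a, b), with equality when both columns put all their mass on a single residue. When only the second column is concentrated on a single residue j, Eq. (1.6) reduces to the sequence-profile score of Eq. (1.5).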
Besides the scoring functions defined in Eqs. (1.6) and (1.7), the following dot
product and Jensen-Shannon scores have also been proposed in the literature [56].
Dot product. Calculating the dot product is the simplest and fastest approach to
compare two profile columns [57]. This method calculates the similarity of two
aligned profile columns as follows.