Introduction - Protein Homology Detection Through Alignment of Markov Random Fields

$$\sum_{i=1}^{20} a_i P_{i \to j} = \sum_{i=1}^{20} a_i \frac{p_{rel}(i,j)}{p_i}$$

This leads to the following scoring function.

$$\mathrm{score}(j) = \log \frac{\sum_{i=1}^{20} a_i P_{i \to j}}{p_j} = \log \sum_{i=1}^{20} a_i \frac{p_{rel}(i,j)}{p_i p_j} \qquad (1.5)$$
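As a minimal sketch of Eq. (1.5), the score of one profile column against one residue could be computed as below. The 3-letter alphabet, the values in `P_REL`, and the name `seq_profile_score` are hypothetical stand-ins for the 20 amino acids and a real substitution model; the background frequencies are assumed to be the marginals of `P_REL`.

```python
import math

# Hypothetical 3-letter alphabet standing in for the 20 amino acids.
# P_REL[i][j]: toy joint probability of i and j being aligned in related proteins.
P_REL = [
    [0.30, 0.05, 0.05],
    [0.05, 0.20, 0.05],
    [0.05, 0.05, 0.20],
]
# Background frequencies p_i, assumed here to be the row marginals of P_REL.
P_BG = [sum(row) for row in P_REL]

def seq_profile_score(a, j):
    """Eq. (1.5): log sum_i a_i * p_rel(i, j) / (p_i * p_j)."""
    total = sum(a[i] * P_REL[i][j] / (P_BG[i] * P_BG[j]) for i in range(len(a)))
    return math.log(total)
```

Under these assumptions, a column that puts all its probability on a single amino acid i gives the log-odds ratio log(p_rel(i, j) / (p_i p_j)), and a column equal to the background distribution scores zero against every residue.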

Again, summing up Eq. (1.5) over all aligned positions yields a score for the whole alignment. Similar to sequence alignment, dynamic programming can be used to generate an optimal alignment between one sequence and one profile for a scoring function defined in Eq. (1.3) or Eq. (1.5).
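The dynamic programming step can be sketched as follows. This is an illustration only: it assumes a global alignment with a linear gap penalty, which the text above does not specify, and the names `align_score`, `S`, and `gap` are hypothetical. `S[k][t]` is taken to hold the precomputed Eq. (1.5) score of profile column k against sequence position t.

```python
def align_score(S, gap=-1.0):
    """Best global alignment score for a column-by-residue score matrix S.

    S[k][t] is the score of profile column k against sequence position t.
    The linear gap penalty is an assumption of this sketch.
    """
    n, m = len(S), len(S[0])
    # D[k][t]: best score aligning the first k columns with the first t residues.
    D = [[0.0] * (m + 1) for _ in range(n + 1)]
    for k in range(1, n + 1):
        D[k][0] = k * gap
    for t in range(1, m + 1):
        D[0][t] = t * gap
    for k in range(1, n + 1):
        for t in range(1, m + 1):
            D[k][t] = max(
                D[k - 1][t - 1] + S[k - 1][t - 1],  # align column k with residue t
                D[k - 1][t] + gap,                  # column k against a gap
                D[k][t - 1] + gap,                  # residue t against a gap
            )
    return D[n][m]
```

The returned value is the sum of the per-position scores along the best path plus the gap penalties; recovering the alignment itself would additionally require a traceback through `D`.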

1.4.5 Scoring Function for Profile-Profile Alignment and Comparison

The sequence-profile scoring function defined in Eq. (1.5) can be extended to score a profile-profile alignment. Let X and Y be two aligned profile columns with amino acid probability distributions $a_i$ ($i = 1, 2, \ldots, 20$) and $b_j$ ($j = 1, 2, \ldots, 20$), respectively. The following log average score, which is a generalization of Eq. (1.5), can be used to estimate the similarity between these two profile columns.

$$\mathrm{LogAverageScore}(a, b) = \log \sum_{i=1}^{20} \sum_{j=1}^{20} a_i b_j \frac{p_{rel}(i,j)}{p_i p_j} \qquad (1.6)$$
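A minimal sketch of Eq. (1.6) follows. The function name, the argument layout, and the 2-letter toy alphabet are assumptions made for illustration; a real implementation would use 20-dimensional columns and a substitution model estimated from data.

```python
import math

def log_average_score(a, b, p_rel, p_bg):
    """Eq. (1.6): log sum_i sum_j a_i * b_j * p_rel(i, j) / (p_i * p_j)."""
    n = len(p_bg)
    total = sum(
        a[i] * b[j] * p_rel[i][j] / (p_bg[i] * p_bg[j])
        for i in range(n)
        for j in range(n)
    )
    return math.log(total)

# Toy 2-letter alphabet (hypothetical values, not a real substitution model).
p_rel = [[0.4, 0.1], [0.1, 0.4]]
p_bg = [0.5, 0.5]
same = log_average_score([0.9, 0.1], [0.9, 0.1], p_rel, p_bg)  # similar columns
diff = log_average_score([0.9, 0.1], [0.1, 0.9], p_rel, p_bg)  # dissimilar columns
```

With these toy values, the two similar columns score above zero and the two dissimilar columns score below zero. When `b` places all its probability on one residue j, the expression reduces to Eq. (1.5).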

Some methods also use the following average mutation score to measure the similarity of these two profile columns.

$$\mathrm{AverageScore}(a, b) = \sum_{i=1}^{20} \sum_{j=1}^{20} a_i b_j \log \frac{p_{rel}(i,j)}{p_i p_j} \qquad (1.7)$$
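Eq. (1.7) moves the logarithm inside the double sum, so it is the expected log-odds score over all residue pairs drawn from the two columns. The sketch below uses a hypothetical function name and argument layout, and it assumes every entry of `p_rel` is strictly positive so that each logarithm is defined.

```python
import math

def average_score(a, b, p_rel, p_bg):
    """Eq. (1.7): sum_i sum_j a_i * b_j * log(p_rel(i, j) / (p_i * p_j))."""
    n = len(p_bg)
    return sum(
        a[i] * b[j] * math.log(p_rel[i][j] / (p_bg[i] * p_bg[j]))
        for i in range(n)
        for j in range(n)
    )
```

Because the logarithm is concave and the weights $a_i b_j$ sum to one, Jensen's inequality implies that this value never exceeds the log average score of Eq. (1.6) for the same pair of columns.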

Besides the scoring functions defined in Eqs. (1.6) and (1.7), the following dot product and Jensen-Shannon scores have also been proposed in the literature [56].

Dot product. Calculating the dot product is the simplest and fastest approach to comparing two profile columns [57]. This method calculates the similarity of two aligned profile columns as follows.
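As a minimal sketch, assuming the plain (unweighted) dot product $\sum_i a_i b_i$ of the two columns' probability vectors, the score could be computed as below. The function name is hypothetical, and the method cited in [57] may apply additional weighting or normalization.

```python
def dot_product_score(a, b):
    """Dot product of two profile columns: sum_i a_i * b_i."""
    if len(a) != len(b):
        raise ValueError("columns must have the same length")
    return sum(x * y for x, y in zip(a, b))
```

Under this assumption the score needs neither a substitution model nor background frequencies, which is consistent with it being the fastest of the column scores discussed here.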
