$W_1 + W_2 + W_3 = 1$
In the formula (Eq. 2), $s(x_i, y_j)$ is the amino acid similarity score between $x_i$ and $y_j$, which is the element at the $i$-th row and $j$-th column of the $n_1 \times n_2$ amino acid substitution matrix $s$. Similarly, $SS(ss(x_i), ss(y_j))$ is the similarity score between the secondary structure ($ss(x_i)$) of residue $x_i$ in protein $X$ and that of residue $y_j$ in protein $Y$ according to the secondary structure similarity matrix SS, and $SA(sa(x_i), sa(y_j))$ is the similarity score between the relative solvent accessibility ($sa(x_i)$) of residue $x_i$ in protein $X$ and that of residue $y_j$ in protein $Y$ according to the solvent accessibility similarity matrix SA.
$W_1$, $W_2$, and $W_3$ are the weights for the amino acid similarity score, the secondary structure similarity score, and the solvent accessibility similarity score, respectively. The secondary structure and solvent accessibility can be predicted automatically by PSpro2.0 [13] (http://sysbio.rnet.missouri.edu/multicom_toolbox/) using a multi-threading technique implemented in MSACompro, or alternatively be provided by the user. The three weights $W_1$, $W_2$, $W_3$ are set to 0.4, 0.5, and 0.1 by default, and can be adjusted by users as well. Referring to
MSAprobs, $\beta$ is a parameter measuring the deviation between suboptimal and optimal alignments, $gap$ ($gap \le 0$) is the gap open penalty, and $ext$ ($ext \le 0$) is the gap extension penalty. We set these three parameters to the same values used in MSAprobs.
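To illustrate how the three weights act on the similarity terms, the following Python sketch combines them as a weighted sum. It assumes that Eq. 2 has this weighted-sum form; the function name `combined_score` and the input values are hypothetical and chosen only for the example, and the check that the weights sum to 1 reflects the constraint $W_1 + W_2 + W_3 = 1$.

```python
import math

# Default weights quoted in the text, in the order:
# amino acid, secondary structure, solvent accessibility.
DEFAULT_WEIGHTS = (0.4, 0.5, 0.1)


def combined_score(aa_score, ss_score, sa_score, weights=DEFAULT_WEIGHTS):
    """Weighted sum of the three similarity terms (assumed form of Eq. 2)."""
    w1, w2, w3 = weights
    # The weights are required to sum to 1.
    if not math.isclose(w1 + w2 + w3, 1.0):
        raise ValueError("weights must sum to 1")
    return w1 * aa_score + w2 * ss_score + w3 * sa_score
```

With the default weights, hypothetical component scores of 2.0, 1.0, and 0.0 combine to 0.4 × 2.0 + 0.5 × 1.0 + 0.1 × 0.0 = 1.3.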
The Gonnet 160 matrix was used as the substitution matrix to generate the similarity scores between two amino acids in proteins [14].
In addition, we designed a simple $3 \times 3$ secondary structure similarity matrix SS, containing the similarity scores of three kinds of secondary structures ($E$, $H$, $C$) as follows:

$$SS = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},$$

where two identical secondary structures receive a score of 1 and all other pairs receive a score of 0.
Similarly, we also defined a $2 \times 2$ solvent accessibility similarity matrix SA, consisting of the similarity scores of two types of relative solvent accessibility ($e$, $b$) as follows:

$$SA = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},$$

where two identical solvent accessibility states receive a score of 1 and different ones a score of 0. Applying the more advanced scoring matrices defined in [15] may lead to further improvement.
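Because both matrices are identity matrices, the lookups reduce to an equality test on the state labels. The Python sketch below makes this explicit, assuming the state orders (E, H, C) and (e, b) listed in the text; the helper names `identity_matrix`, `ss_score`, and `sa_score` are hypothetical and are not part of MSACompro.

```python
# State labels in the order given in the text.
SS_STATES = "EHC"  # three secondary structure labels
SA_STATES = "eb"   # two relative solvent accessibility labels


def identity_matrix(n):
    """Return an n x n matrix with 1 on the diagonal and 0 elsewhere."""
    return [[1 if row == col else 0 for col in range(n)] for row in range(n)]


SS = identity_matrix(len(SS_STATES))
SA = identity_matrix(len(SA_STATES))


def ss_score(state_a, state_b):
    """Look up the secondary structure similarity of two state labels."""
    return SS[SS_STATES.index(state_a)][SS_STATES.index(state_b)]


def sa_score(state_a, state_b):
    """Look up the solvent accessibility similarity of two state labels."""
    return SA[SA_STATES.index(state_a)][SA_STATES.index(state_b)]
```

For example, `ss_score("H", "H")` returns 1 and `ss_score("H", "E")` returns 0. Keeping the matrices explicit, rather than hard-coding an equality test, leaves room to swap in non-identity scores later.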
Each posterior residue-residue alignment probability element in the first kind of posterior probability matrix ($P_{XY}$) can be calculated from the partition function as: