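Suppose we get the optimal global alignment of $X$ and $Y$ by tracing back through AS as follows:

$$
\begin{array}{l}
x_1 \; x_2 \; \cdots \; x_m \; x_p \; \cdots \; x_{n_1} \\
y_1 \; \cdots \; y_k \; y_{k+1} \; \cdots \; y_{n_2}
\end{array}
$$

For the purpose of calculating $\mathrm{CMscore}(X, Y)$, a new alignment is generated after removing the pairs containing gaps:

$$
\begin{array}{l}
x_1 \; \cdots \; x_m \; \cdots \; x_{n_1} \\
y_1 \; \cdots \; y_{k+1} \; \cdots \; y_{n_2}
\end{array}
$$

We also denote the new alignment as:

$$
\begin{array}{cccc}
x'_1 & x'_2 & \cdots & x'_n \\
y'_1 & y'_2 & \cdots & y'_n,
\end{array}
$$

where $n$ is the length of the new alignment without gaps.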
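The gap-removal step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two aligned rows are given as equal-length strings, gaps are written as `-`, and the function name `remove_gap_columns` and the toy sequences are hypothetical, not part of MSACompro. Besides the gap-free rows, the sketch records which positions of the original sequences survive, since those indices are what a contact map would later be restricted to.

```python
def remove_gap_columns(aligned_x, aligned_y, gap="-"):
    """Drop every alignment column in which either row holds a gap.

    Returns the gap-free rows plus, for each kept column, the 0-based
    position of the residue in the original (ungapped) sequences.
    """
    if len(aligned_x) != len(aligned_y):
        raise ValueError("aligned rows must have equal length")

    kept_x, kept_y = [], []   # residues x'_1..x'_n and y'_1..y'_n
    idx_x, idx_y = [], []     # their positions in X and Y
    i = j = 0                 # running positions in the ungapped X and Y

    for a, b in zip(aligned_x, aligned_y):
        if a != gap and b != gap:
            kept_x.append(a)
            kept_y.append(b)
            idx_x.append(i)
            idx_y.append(j)
        # Advance each counter only when its row contributed a residue.
        if a != gap:
            i += 1
        if b != gap:
            j += 1

    return "".join(kept_x), "".join(kept_y), idx_x, idx_y


# Toy alignment (hypothetical data): X = "ACDEF", Y = "CDGEF".
new_x, new_y, idx_x, idx_y = remove_gap_columns("ACD-EF", "-CDGEF")
n = len(new_x)  # length of the new alignment without gaps
```

In this toy case the first and fourth columns contain a gap and are dropped, so four residue pairs remain.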
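From this alignment, we can construct two contact map matrices, $\mathrm{CMap}_X$ and $\mathrm{CMap}_Y$, which consist of predicted contact probability scores for sequences of $X$ and $Y$ respectively, as follows:

$$
\mathrm{CMap}_X =
\begin{bmatrix}
x'_{11} & x'_{12} & \cdots & x'_{1n} \\
x'_{21} & x'_{22} & \cdots & x'_{2n} \\
\vdots & \vdots & & \vdots \\
x'_{n1} & x'_{n2} & \cdots & x'_{nn}
\end{bmatrix}
\tag{6}
$$

$$
\mathrm{CMap}_Y =
\begin{bmatrix}
y'_{11} & y'_{12} & \cdots & y'_{1n} \\
y'_{21} & y'_{22} & \cdots & y'_{2n} \\
\vdots & \vdots & & \vdots \\
y'_{n1} & y'_{n2} & \cdots & y'_{nn}
\end{bmatrix}
$$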
$x_{ij}$ is the predicted contact probability score between amino acids $x_i$ and $x_j$ in protein sequence $X$, and similarly, $y_{ij}$ is the predicted contact probability score between amino acids $y_i$ and $y_j$ in protein sequence $Y$. The primed entries $x'_{ij}$ and $y'_{ij}$ are the same scores for the residues retained in the gap-free alignment. The residue-residue contact probability scores introduced above are predicted from the protein sequence by NNcon [17] (http://sysbio.rnet.missouri.edu/multicom_toolbox/). The contact map correlation score matrix $\mathrm{CMap}_{XY}$ is designed in our MSACompro as the multiplication of $\mathrm{CMap}_X$ and $\mathrm{CMap}_Y$:
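$$
\mathrm{CMap}_{XY} = \mathrm{CMap}_X \times \mathrm{CMap}_Y =
\begin{bmatrix}
xy'_{11} & xy'_{12} & \cdots & xy'_{1n} \\
xy'_{21} & xy'_{22} & \cdots & xy'_{2n} \\
\vdots & \vdots & & \vdots \\
xy'_{n1} & xy'_{n2} & \cdots & xy'_{nn}
\end{bmatrix}
\tag{7}
$$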
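As a rough illustration of how the two contact maps and their product might be assembled, the sketch below restricts a full-length contact probability matrix to the residues kept in the gap-free alignment and then multiplies the two resulting $n \times n$ matrices. Several things here are assumptions rather than details given in the text: the helper names `aligned_contact_map` and `matmul` are hypothetical, the probability values are made up rather than NNcon output, and "multiplication" is read as the ordinary matrix product (an element-wise product would be the alternative reading).

```python
def aligned_contact_map(full_map, kept):
    """Restrict a full-length contact map to the residues kept in the
    gap-free alignment; `kept` holds their 0-based sequence positions."""
    return [[full_map[i][j] for j in kept] for i in kept]


def matmul(a, b):
    """Ordinary matrix product of two n x n matrices (nested lists).

    Assumption: the text's "multiplication" is taken to mean this product.
    """
    n = len(a)
    if any(len(row) != n for row in a) or len(b) != n or any(len(row) != n for row in b):
        raise ValueError("both matrices must be n x n with the same n")
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Made-up contact probabilities for a 3-residue X and a 2-residue Y.
full_x = [[1.0, 0.5, 0.25],
          [0.5, 1.0, 0.5],
          [0.25, 0.5, 1.0]]
full_y = [[1.0, 0.5],
          [0.5, 1.0]]

# Suppose the gap-free alignment pairs X positions 0 and 2 with Y positions 0 and 1.
cmap_x = aligned_contact_map(full_x, [0, 2])
cmap_y = aligned_contact_map(full_y, [0, 1])
cmap_xy = matmul(cmap_x, cmap_y)
```

Because both restricted maps are indexed by alignment column rather than by original sequence position, they always have the same size $n$, which is what makes the product well defined.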