MSACompro: Improving Multiple Protein Sequence Alignment by Predicted Structural Features - Multiple Sequence Alignment Methods

Suppose we get the optimal global alignment of X and Y by tracing back through AS as follows:

$$
\begin{array}{ccccccccc}
x_1 & x_2 & \cdots & - & x_m & \cdots & x_p & \cdots & x_{n_1} \\
y_1 & - & \cdots & y_k & y_{k+1} & \cdots & - & \cdots & y_{n_2}
\end{array}
$$

For the purpose of calculating CMscore, a new alignment is generated after removing the pairs containing gaps:

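$$
\begin{array}{ccccc}
x_1 & \cdots & x_m & \cdots & x_{n_1} \\
y_1 & \cdots & y_{k+1} & \cdots & y_{n_2}
\end{array}
$$

We also denote the new alignment as:

$$
\begin{array}{cccc}
x'_1 & x'_2 & \cdots & x'_n \\
y'_1 & y'_2 & \cdots & y'_n
\end{array}
$$

where n is the length of the new alignment without gaps.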
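The gap-removal step can be sketched in Python as follows. This is only an illustration, not the MSACompro implementation: the function name `remove_gapped_pairs`, the use of `-` as the gap symbol, and the returned index lists (0-based positions in the original sequences) are assumptions made for this example.

```python
def remove_gapped_pairs(aligned_x, aligned_y, gap="-"):
    """Drop every alignment column in which either row holds a gap.

    Returns the two gap-free rows and, for each kept column, the 0-based
    index of the residue in the original (ungapped) sequences X and Y.
    All names here are hypothetical and chosen for illustration.
    """
    if len(aligned_x) != len(aligned_y):
        raise ValueError("aligned rows must have the same length")

    new_x, new_y, idx_x, idx_y = [], [], [], []
    i = j = 0  # current positions in the ungapped sequences X and Y
    for a, b in zip(aligned_x, aligned_y):
        if a != gap and b != gap:
            # Both rows hold a residue: keep this pair.
            new_x.append(a)
            new_y.append(b)
            idx_x.append(i)
            idx_y.append(j)
        # Advance a sequence position only when its row held a residue.
        if a != gap:
            i += 1
        if b != gap:
            j += 1
    return "".join(new_x), "".join(new_y), idx_x, idx_y
```

The index lists record which residues of X and Y survive gap removal, which is the information needed to pick the matching rows and columns out of a predicted contact map.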

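From this alignment, we can construct two contact map matrices, $\mathrm{CMap}_X$ and $\mathrm{CMap}_Y$, which consist of predicted contact probability scores for sequences of X and Y respectively, as follows:

$$
\mathrm{CMap}_X =
\begin{pmatrix}
x'_{11} & x'_{12} & \cdots & x'_{1n} \\
x'_{21} & x'_{22} & \cdots & x'_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
x'_{n1} & x'_{n2} & \cdots & x'_{nn}
\end{pmatrix},
\qquad
\mathrm{CMap}_Y =
\begin{pmatrix}
y'_{11} & y'_{12} & \cdots & y'_{1n} \\
y'_{21} & y'_{22} & \cdots & y'_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
y'_{n1} & y'_{n2} & \cdots & y'_{nn}
\end{pmatrix}
\tag{6}
$$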
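As a minimal sketch of how such a matrix could be assembled, assume the predicted contact probabilities for a whole sequence are available as a square NumPy array, and that `kept_idx` lists the 0-based positions of the residues that remain after gap removal. The text does not specify the predictor's output format, so the array layout is an assumption, and the helper name `aligned_contact_map` is hypothetical.

```python
import numpy as np


def aligned_contact_map(full_map, kept_idx):
    """Restrict a full L x L contact probability matrix to the n residues
    kept in the gap-free alignment, giving an n x n matrix.

    full_map : square array; entry (i, j) is the predicted contact
               probability between residues i and j (assumed layout).
    kept_idx : 0-based residue indices retained after gap removal.
    """
    full_map = np.asarray(full_map, dtype=float)
    if full_map.ndim != 2 or full_map.shape[0] != full_map.shape[1]:
        raise ValueError("full_map must be a square matrix")
    # np.ix_ selects the same index set on rows and columns at once.
    return full_map[np.ix_(kept_idx, kept_idx)]


# Made-up probabilities for a 4-residue sequence (not real predictor output).
toy_full_map = np.array([
    [1.0, 0.8, 0.1, 0.3],
    [0.8, 1.0, 0.6, 0.2],
    [0.1, 0.6, 1.0, 0.7],
    [0.3, 0.2, 0.7, 1.0],
])
toy_cmap = aligned_contact_map(toy_full_map, [0, 3])  # keep residues 0 and 3
```

Applying the same helper to the contact map of Y with Y's retained indices gives a second matrix of the same size n x n, because both index lists have one entry per column of the gap-free alignment.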

Here, $x'_{ij}$ is the predicted contact probability score between amino acids $x'_i$ and $x'_j$ in protein sequence X, and similarly, $y'_{ij}$ is the predicted contact probability score between amino acids $y'_i$ and $y'_j$ in protein sequence Y. The residue-residue contact probability scores introduced above are predicted from the protein sequence by NNcon [17] (http://sysbio.rnet.missouri.edu/multicom_toolbox/). The contact map correlation score matrix $\mathrm{CMap}_{XY}$ is designed in our MSACompro as the multiplication of $\mathrm{CMap}_X$ and $\mathrm{CMap}_Y$:

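$$
\mathrm{CMap}_{XY} = \mathrm{CMap}_X \, \mathrm{CMap}_Y =
\begin{pmatrix}
xy'_{11} & xy'_{12} & \cdots & xy'_{1n} \\
xy'_{21} & xy'_{22} & \cdots & xy'_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
xy'_{n1} & xy'_{n2} & \cdots & xy'_{nn}
\end{pmatrix}
\tag{7}
$$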
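The product in Eq. (7) can be illustrated with a short NumPy sketch. The text says only "multiplication", so reading it as an ordinary matrix product is an assumption of this example; if an element-wise product were intended, `cmap_x * cmap_y` would replace the `@` operator. The function name `contact_map_correlation` is hypothetical.

```python
import numpy as np


def contact_map_correlation(cmap_x, cmap_y):
    """Multiply two n x n contact maps built from the same gap-free alignment.

    Assumption: "multiplication" is taken to mean the matrix product.
    """
    cmap_x = np.asarray(cmap_x, dtype=float)
    cmap_y = np.asarray(cmap_y, dtype=float)
    if cmap_x.ndim != 2 or cmap_x.shape[0] != cmap_x.shape[1]:
        raise ValueError("cmap_x must be a square matrix")
    if cmap_x.shape != cmap_y.shape:
        raise ValueError("both contact maps must have the same n x n shape")
    # Entry (i, j) sums x'_{ik} * y'_{kj} over all aligned positions k.
    return cmap_x @ cmap_y
```

Under the matrix-product reading, entry (i, j) of the result is large when residue i of X and residue j of Y are both predicted to be in contact with the same aligned positions k.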
