The three inputs are, in order, the path to the Pspro bin
directory, the input file containing the target sequences in FASTA
format, and the output file for the multiple sequence alignment of
the target proteins. An example command is as follows:
./auto_run_msacompro.pl /storage/shared/pspro2/bin/ ../test/BB40004.fasta ../test/BB40004.msa
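For batch use, the same call can be wrapped in a short script. The following is a minimal Python sketch, not part of MSACompro itself: the helper names build_msacompro_command and run_msacompro are hypothetical, and the only facts assumed about the tool are the three positional inputs described above.

```python
import subprocess
from pathlib import Path


def build_msacompro_command(script, pspro_bin, fasta_in, msa_out):
    """Return the argument list: script, Pspro bin dir, input FASTA, output MSA.

    Hypothetical helper; it only mirrors the three positional inputs
    shown in the example command.
    """
    return [str(script), str(pspro_bin), str(fasta_in), str(msa_out)]


def run_msacompro(script, pspro_bin, fasta_in, msa_out):
    """Run the script once, failing early if the input FASTA file is missing."""
    if not Path(fasta_in).is_file():
        raise FileNotFoundError(f"input FASTA file not found: {fasta_in}")
    cmd = build_msacompro_command(script, pspro_bin, fasta_in, msa_out)
    # check=True raises CalledProcessError if the script exits with an error
    return subprocess.run(cmd, check=True)
```

Calling run_msacompro("./auto_run_msacompro.pl", "/storage/shared/pspro2/bin/", "../test/BB40004.fasta", "../test/BB40004.msa") would issue the example command shown above.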
Users can also run MSACompro with their own structural
information gathered in advance; the available options are
described in the Readme.txt file.
3 Methods
Fig. 1 shows the workflow of the MSACompro method. Given multiple
input protein sequences, pairwise posterior probability matrices
are first generated based on a partition function that integrates
the predicted structural information and a pairwise hidden Markov
model. Pairwise distance matrices between the proteins are then
constructed by combining the posterior probability matrices with
the newly introduced contact map similarity matrices. Based on the
distances between the protein pairs, a guide tree is built and the
posterior probability matrices are transformed by a weighting
scheme. Finally, progressive alignment and iterative alignment
refinement are performed to obtain the final multiple sequence
alignment. More details are given below.
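The distance and guide-tree steps of this workflow can be illustrated with a small Python sketch. It is not the MSACompro implementation: the text above does not state how the two matrices are combined or which tree-building method is used, so the linear weighting (the weight parameter) and the use of UPGMA are both assumptions made only for illustration, as are the function names.

```python
def combine_distances(posterior_dist, contact_sim, weight=0.5):
    """Blend a posterior-derived distance with a contact map similarity.

    ASSUMPTION: a linear mix, with similarity converted to distance as
    (1 - similarity). The actual MSACompro formula is not given here.
    """
    n = len(posterior_dist)
    return [
        [
            0.0 if i == j
            else (1 - weight) * posterior_dist[i][j] + weight * (1.0 - contact_sim[i][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


def upgma(dist, names):
    """Build a guide tree as nested tuples with UPGMA (assumed method)."""
    n = len(names)
    clusters = {i: (names[i], 1) for i in range(n)}  # id -> (subtree, size)
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        a, b = min(d, key=d.get)  # closest pair of clusters
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        del d[(a, b)]
        for c in clusters:
            # size-weighted average distance from the merged cluster to c
            d_ac = d.pop((min(a, c), max(a, c)))
            d_bc = d.pop((min(b, c), max(b, c)))
            d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        next_id += 1
    return next(iter(clusters.values()))[0]
```

With three sequences in which A and B are close under both measures, upgma(combine_distances(...), ["A", "B", "C"]) joins A and B first and attaches C last, which is the order in which a progressive alignment would then proceed.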
3.1 Calculation of Pairwise Posterior Probability Matrices Integrating the Predicted Structural Information
Fig. 1 The workflow of the MSACompro method