3. MRFsearch_Edge_Potential.tar.gz (~1G): this package contains pairwise residue correlation information for the proteins in the database.
4. nr70.tar.gz (~1.6G) and nr90.tar.gz (~2.3G): the formatted NR70 and NR90 protein sequence databases.
5. MRFsearch_Database_List.tar.gz (~1M): this package contains some database lists. The database lists will be updated weekly.
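As a rough illustration of how one of the downloaded packages might be unpacked, the sketch below uses Python's standard tarfile module. The function name unpack_package and the destination directory are hypothetical and not part of MRFsearch. The layout inside the archives is not described here, so the function only returns the member names it finds.

```python
import tarfile
from pathlib import Path


def unpack_package(archive_path, dest_dir):
    """Extract a .tar.gz package into dest_dir and return its member names.

    Hypothetical helper for illustration; it is not part of MRFsearch.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        # Use the safer "data" extraction filter when this Python provides it.
        if hasattr(tarfile, "data_filter"):
            archive.extractall(dest, filter="data")
        else:
            archive.extractall(dest)
    return names
```

A call such as unpack_package("MRFsearch_Database_List.tar.gz", "databases") would unpack the database lists into a local directory named databases, which is an assumed location.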
3.3 Feature Files
For each protein there are two feature files. One is a *.tgt file that contains some basic information about the protein and some predicted local features such as secondary structure. Note that not all of the features are used in MRFsearch. Below are some details.
Sequence information
    Primary sequence

Profile information
    NEFF value: the average entropy of the protein sequence profile
    PSSM matrix: position-specific score matrix generated by PSI-BLAST with 5 iterations and E-value 0.001
    PSFM matrix: position-specific frequency matrix generated by PSI-BLAST with 5 iterations and E-value 0.001
    HMM matrix: the Hidden Markov Model generated by buildali2.pl in the HHpred package

Structure information
    3-class secondary structure: the 3-class secondary structure types predicted by PSIPRED [1]
    8-class secondary structure: the 8-class secondary structure types predicted by RaptorX-SS8 [2]
    3-class solvent accessibility: the predicted 3-state solvent accessibility, which is discretized into three equal-frequency states: buried, intermediate and exposed
    Disorder prediction: the disorder information predicted by DISOPRED [3]
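The equal-frequency discretization mentioned for solvent accessibility can be sketched as follows. This is a minimal illustration, not the predictor's actual procedure. The cutoffs are derived from a made-up sample of accessibility values, the function names are hypothetical, and the real thresholds used to build the *.tgt files are not stated here.

```python
from bisect import bisect_right

STATE_NAMES = ("buried", "intermediate", "exposed")


def equal_frequency_cutoffs(values, n_states=3):
    """Return cut points that split values into n_states groups of near-equal size."""
    ordered = sorted(values)
    n = len(ordered)
    if n < n_states:
        raise ValueError("need at least one value per state")
    # The k-th cut point is the first value of the (k+1)-th group.
    return [ordered[(k * n) // n_states] for k in range(1, n_states)]


def discretize(value, cutoffs):
    """Map one accessibility value to a state name using the cut points."""
    return STATE_NAMES[bisect_right(cutoffs, value)]


# Made-up accessibility values, used only to derive example cut points.
sample = [0.05, 0.10, 0.20, 0.30, 0.45, 0.60, 0.70, 0.85, 0.90]
cutoffs = equal_frequency_cutoffs(sample)
states = [discretize(v, cutoffs) for v in sample]
```

With this sample, each of the three states receives three of the nine values, which is the property that "equal-frequency" refers to.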
The other feature file contains pairwise residue correlation information produced by the EPAD package. For a protein of length N, it contains N(N − 6)/2 rows, where each row contains 9 numbers representing the probability of 13 distance bins (3–4, 4–5, 5–6, …, 14–15, and >15 Å).
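To make the distance bins concrete, the sketch below enumerates the bins exactly as listed in the text and maps a distance in Å to its bin. It does not parse the EPAD feature file, whose row layout is not specified here. Two points are assumptions: each bin is treated as half-open (lower edge included, upper edge excluded), and distances under 3 Å, for which no bin is listed, return None.

```python
import math

# One-angstrom bins 3-4, 4-5, ..., 14-15, followed by the open-ended >15 bin.
LOWER_EDGES = list(range(3, 15))
BIN_LABELS = [f"{lo}-{lo + 1}" for lo in LOWER_EDGES] + [">15"]


def distance_bin(distance):
    """Return the index into BIN_LABELS for a distance in angstroms.

    Assumptions: bins are half-open [lo, lo + 1); a distance of exactly 15
    falls in the last bin; distances below 3 have no listed bin, so None
    is returned for them.
    """
    if distance < 3:
        return None
    if distance >= 15:
        return len(BIN_LABELS) - 1
    return int(math.floor(distance)) - 3
```

For example, BIN_LABELS[distance_bin(7.2)] gives the label "7-8", and any distance of 15 Å or more maps to the final ">15" bin.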