3.4 MRFsearch Ranking File
After the database search, MRFsearch generates a ranking file, as shown in Fig. 3.1. The first line contains the query protein name, the second line shows the query protein sequence, and the third line gives the query protein length. The fourth line reports NEFF (number of effective sequence homologs), which is derived from the Shannon sequence entropy of a PSI-BLAST sequence profile. NEFF can be interpreted as the average number of amino acid (AA) substitutions across all residues of a protein, and it ranges from 1 to 20 (i.e., the number of AA types). NEFF at one residue is calculated as $\exp\left(-\sum_k p_k \ln p_k\right)$, where $p_k$ is the probability of the $k$th AA type, and NEFF for the whole protein is the average across all residues. Generally speaking, NEFF quantifies the homologous information content available for a given protein: the larger the NEFF value, the more homologous information its profile contains. The fifth line contains the number of proteins searched by MRFsearch.
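The NEFF definition above can be illustrated with a minimal Python sketch. It assumes the profile is already available as a list of per-residue probability vectors (one probability per AA type, each vector summing to 1). The function names `neff_at_residue` and `neff_for_protein` are hypothetical and are not part of MRFsearch, and zero probabilities are assumed to contribute nothing to the entropy.

```python
import math


def neff_at_residue(probs):
    """Return exp(-sum_k p_k ln p_k) for one profile column.

    Zero probabilities are skipped, i.e. 0 * ln(0) is treated as 0
    (an assumption of this sketch).
    """
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return math.exp(entropy)


def neff_for_protein(profile):
    """Average the per-residue NEFF over all residues of the protein."""
    if not profile:
        raise ValueError("profile must contain at least one residue")
    return sum(neff_at_residue(column) for column in profile) / len(profile)


# Two illustrative columns: one fully conserved, one uniform over 20 AA types.
conserved = [1.0] + [0.0] * 19
uniform = [1.0 / 20] * 20
print(neff_for_protein([conserved, uniform]))
```

A fully conserved residue gives the lower bound of 1 and a uniform distribution over the 20 AA types gives the upper bound of 20, which matches the range stated above.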
The meaning of each column is explained as follows.
Column 1 'No': Ranking of all the searched proteins.
Column 2 'Proteins': Name of the protein (PDB ID or SCOP protein name) in the databases.
Column 3 'P-value': The P-value of the alignment. The smaller, the better.
Column 4 'Score': The raw alignment score between the query and subject proteins.
Column 5 'Node': The accumulative node alignment potential.
Fig. 3.1 An example ranking file generated by MRFsearch
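As a rough illustration of how the columns described above might be consumed by a script, the following Python sketch reads one table row into a record and filters hits by P-value. It assumes that each row is whitespace-separated and that the first five fields follow the column order given above. The exact file layout is not specified here, so `RankingRow`, `parse_row`, and `hits_below` are hypothetical helpers, and the sample rows are made-up values rather than MRFsearch output.

```python
from dataclasses import dataclass


@dataclass
class RankingRow:
    rank: int       # column 1, 'No'
    protein: str    # column 2, 'Proteins'
    p_value: float  # column 3, 'P-value'
    score: float    # column 4, 'Score'
    node: float     # column 5, 'Node'


def parse_row(line):
    """Parse one whitespace-separated row (assumed layout); extra fields are ignored."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields, got {len(fields)}")
    return RankingRow(
        rank=int(fields[0]),
        protein=fields[1],
        p_value=float(fields[2]),
        score=float(fields[3]),
        node=float(fields[4]),
    )


def hits_below(rows, max_p_value):
    """Keep rows whose P-value is at most max_p_value (smaller is better)."""
    return [row for row in rows if row.p_value <= max_p_value]


# Made-up rows for illustration only; not taken from Fig. 3.1.
sample = ["1 protA 1.2e-10 350.5 420.1", "2 protB 3.0e-02 90.0 110.7"]
rows = [parse_row(line) for line in sample]
print([row.protein for row in hits_below(rows, 1e-3)])
```

Keeping the P-value threshold as a parameter avoids hard-coding a significance cutoff, which this section does not prescribe.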