Table 4.16 Contribution of the edge alignment potential and mutual information (MI), measured
by alignment recall improvement on proteins with at least 256 non-redundant sequence homologs
in two benchmarks, Set3.6K and Set2.6K

                                 391 pairs in Set3.6K            509 pairs in Set2.6K
                                 Exact match (%)  4-offset (%)   Exact match (%)  4-offset (%)
Only node potential              59.5             63.4           71.3             75.8
Node + edge potential, no MI     62.1             66.7           73.5             78.1
Node + edge potential with MI    65.2             69.8           76.6             81.0

The structure alignments generated by DeepAlign are used as reference alignments.
information is mainly useful for proteins with many sequence homologs since it is
close to 0 for proteins with few sequence homologs. As shown in Tables 4.15 and
4.16 , if only the proteins with at least 256 non-redundant sequence homologs are
considered, the improvement resulting from mutual information is about 3%.
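The 3% figure can be checked directly against Table 4.16. The short Python sketch below recomputes the recall gains from the table values; the dictionary keys and the helper name `gain` are illustrative choices, not names from the original work.

```python
# Recall values (%) transcribed from Table 4.16.
# Column order: Set3.6K exact match, Set3.6K 4-offset,
#               Set2.6K exact match, Set2.6K 4-offset.
RECALL = {
    "node": (59.5, 63.4, 71.3, 75.8),
    "node+edge, no MI": (62.1, 66.7, 73.5, 78.1),
    "node+edge, with MI": (65.2, 69.8, 76.6, 81.0),
}


def gain(after, before):
    """Per-column recall difference (percentage points), rounded to 0.1."""
    return [round(a - b, 1) for a, b in zip(RECALL[after], RECALL[before])]


# Gain attributable to mutual information (edge potential already included).
mi_gain = gain("node+edge, with MI", "node+edge, no MI")
# Gain attributable to the edge potential without mutual information.
edge_gain = gain("node+edge, no MI", "node")

print("MI gain:  ", mi_gain)
print("Edge gain:", edge_gain)
```

Across the four columns, the gain from mutual information is between 2.9 and 3.1 percentage points, which matches the figure quoted in the text.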
4.7 Running Time
Figure 4.1 shows the running time of MRFalign with respect to protein length. As a
control, we also show the running time of the Viterbi algorithm, which our ADMM
algorithm uses to generate an alignment at each iteration. As the figure shows,
MRFalign is no more than 10 times slower than the Viterbi algorithm. To speed up
homology detection, we may first use the Viterbi algorithm to perform an initial
search without considering the edge alignment potential, and keep only the top 10%
of proteins for further examination. We then run MRFalign to search for homologs
among this top 10%. Therefore, although MRFalign may be slow compared to the
Viterbi algorithm, empirically we can perform homology search only slightly slower
than with the Viterbi algorithm alone.
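The two-stage search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the MRFalign implementation: `fast_score` stands in for a Viterbi alignment score computed without the edge potential, `slow_score` stands in for the full MRFalign score, and both are hypothetical callables supplied by the caller. The helper `relative_cost` is likewise an assumption; it estimates the total cost relative to a Viterbi-only search.

```python
import math


def two_stage_search(query, database, fast_score, slow_score, keep_fraction=0.10):
    """Prefilter with a cheap score, then rescore the survivors with an expensive one.

    fast_score and slow_score are hypothetical callables taking (query, target)
    and returning a number where higher means more similar.
    """
    if not database:
        return []
    # Stage 1: rank every target with the cheap score and keep the top fraction.
    n_keep = max(1, math.ceil(keep_fraction * len(database)))
    ranked = sorted(database, key=lambda t: fast_score(query, t), reverse=True)
    prefiltered = ranked[:n_keep]
    # Stage 2: rescore only the survivors with the expensive score.
    rescored = [(t, slow_score(query, t)) for t in prefiltered]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored


def relative_cost(keep_fraction=0.10, slowdown=10.0):
    """Total cost in units of one full cheap-score pass over the database.

    Assumes the expensive score costs `slowdown` times the cheap score per target.
    """
    return 1.0 + keep_fraction * slowdown
```

With the bounds quoted in the text (a 10% cutoff and MRFalign at most 10 times slower than Viterbi), `relative_cost()` gives 2.0, so under these assumptions the two-stage search costs at most about twice a Viterbi-only search; the actual overhead is smaller whenever MRFalign's slowdown is below that bound.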
4.8 Is Our MRFalign Method Overtrained?
We conducted two experiments to show that MRFalign is not overtrained. In the
first experiment, we used 36 CASP10 hard targets as the test data. Since our
training set was built before CASP10 started, we can be confident that there is no
redundancy between the CASP10 hard targets and our training data. Using MRFalign
and HHpred, respectively, we searched each of these 36 test targets against