structural features found in intermediate structures [26]. This
Resolution-Adapted Structural RECombination (RASREC) protocol [123] implements a
genetic algorithm to iterate multiple rounds of structure determination in
which structural features identified in previous rounds are recombined. To
improve sampling of non-local β-sheet topologies, it uses broken-chain folding
kinematics that hold pairings identified in previous rounds of simulation
in place [124]. We showed that the improved sampling of
RASREC is essential for obtaining accurate structures across a benchmark set of
11 proteins in the 15-25 kDa size range using chemical shifts, backbone RDCs
and HN-HN NOE data; in the majority of cases the improved sampling
methodology makes a larger contribution to convergence than incorporation
of additional experimental data [123]. Experimental data are invaluable for
guiding sampling to the vicinity of the global energy minimum, but for larger
proteins the standard CS-ROSETTA fragment assembly protocol does not
converge on the native minimum even with experimental data, and the more
powerful RASREC approach is necessary to converge to accurate solutions [123].
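
The idea of iterating rounds in which features from earlier rounds are
recombined can be illustrated with a deliberately simplified toy. The sketch
below is not the RASREC implementation and uses no Rosetta code: it assumes
that a structure can be reduced to a list of binary features (for example,
whether a given pairing is present), and it assumes a hypothetical energy
function that simply counts disagreements with a known target. All names
(energy, recombine, mutate, iterate_rounds) and all parameter values are
invented for illustration.

```python
import random


def energy(features, native):
    """Toy energy: number of features that differ from a hypothetical native set."""
    return sum(f != n for f, n in zip(features, native))


def recombine(parent_a, parent_b, rng):
    """Uniform crossover: each feature is inherited from one of the two parents."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def mutate(features, rate, rng):
    """Flip each binary feature with a small probability to keep exploring."""
    return [1 - f if rng.random() < rate else f for f in features]


def iterate_rounds(native, pool_size=40, keep=10, rounds=30, rate=0.02, seed=0):
    """Run several rounds; return the best toy energy seen at the start of each round."""
    rng = random.Random(seed)
    pool = [[rng.randint(0, 1) for _ in native] for _ in range(pool_size)]
    best_per_round = []
    for _ in range(rounds):
        # Rank the current pool and keep the lowest-energy members unchanged.
        pool.sort(key=lambda s: energy(s, native))
        survivors = pool[:keep]
        best_per_round.append(energy(survivors[0], native))
        # Refill the pool by recombining features of two distinct survivors.
        children = []
        while len(children) < pool_size - keep:
            parent_a, parent_b = rng.sample(survivors, 2)
            children.append(mutate(recombine(parent_a, parent_b, rng), rate, rng))
        pool = survivors + children
    return best_per_round


if __name__ == "__main__":
    target = [1, 0] * 15  # hypothetical "native" feature set, 30 features
    print(iterate_rounds(target))
```

Because the survivors of each round are carried over unchanged, the best toy
energy cannot get worse from one round to the next. In a real calculation the
native feature set is of course unknown, and the energy would come from the
force field and the experimental restraints rather than from a comparison
with a target.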
4.5 Concluding Remarks
Since the beginning of computational structural biology [125], researchers have
exploited the fact that imposing common knowledge about the biopolymer, such as
bond lengths, angles and atomic radii, leads to super-resolution, where the
coordinate accuracy is better than the resolution limit of the data [126]. This is
true for X-ray crystallography, where coordinate accuracies in the
sub-ångström range are routinely reached from diffraction data with 1-2 Å
resolution, and for conventional NMR structures, where an estimated 1-2 Å
coordinate accuracy is reached from NOE distance restraints with 2-3 Å
resolution. However, as discussed above, the synergy between physicochemical
knowledge and integrated experimental data can only be realised if
the native energy basin is reached during sampling.
Thus, we have introduced the notion of instructiveness of restraints to
characterise their ability to guide a structure calculation efficiently towards the
native energy basin. We further define a sparse data set as one for which the
instructive restraints are insufficient in number or resolution to restrict the
sampling to the native energy basin. Hence, a vast area of conformational
space has to be sampled, which can resemble the metaphorical search for a
needle in a haystack. Only a very small hyper-volume of 3-4 Å around the
native structure yields a consistent energy signal. If this low-energy area is
missed entirely, calculations might remain unconverged or, worse, converge
towards an alternative low-energy region. To rule out the latter case, it is
possible to check for agreement of force-field and experimental data, as
suggested by Raman et al. [26]. However, it would be far better if algorithms were
able to consistently find all low-energy regions. Improvements of optimisation
methods for rugged energy landscapes will thus yield the crucial
methodological advances for structure calculation from sparse data [123].
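
The needle-in-a-haystack picture can be made more tangible with a purely
geometric analogy. The sketch below is not a model of protein conformational
space: it assumes that sampling is uniform inside a hypothetical cube of side
box_side in dim dimensions, and that the native basin is a ball of the given
radius. The function names and the example values (a 3.5 Å radius inside a
20 Å box) are assumptions chosen only for illustration. Under these
assumptions the chance that a random sample lands in the basin falls off
steeply as the number of dimensions grows.

```python
import math


def ball_volume(radius, dim):
    """Volume of a dim-dimensional ball: pi^(d/2) / Gamma(d/2 + 1) * r^d."""
    return math.pi ** (dim / 2) / math.gamma(dim / 2 + 1) * radius ** dim


def basin_fraction(radius, box_side, dim):
    """Fraction of a dim-dimensional cube occupied by a ball of the given radius."""
    return ball_volume(radius, dim) / box_side ** dim


if __name__ == "__main__":
    # Hypothetical numbers: a 3.5 A basin radius inside a 20 A search box.
    for dim in (1, 2, 3, 10, 30):
        print(dim, basin_fraction(3.5, 20.0, dim))
```

The analogy only conveys the scaling argument: unless restraints confine
sampling to a region comparable in size to the basin itself, the basin
occupies a vanishing share of the space that has to be searched.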