Biomedical Engineering Reference
Before we leave the discussion of ab initio methods, the Rosetta method is worth mentioning because
it is widely recognized as one of the most promising protein structure prediction methods. The
Rosetta method typically involves breaking up a protein of unknown structure into short segments, or
fragments, of three and nine amino acids in length. The fragment libraries that are used to limit the
conformations of these segments are extracted from one of the online protein structure databases.
Monte Carlo methods are used to identify the combinations of conformations with the lowest free
energy. The sequential construction of protein structures is repeated thousands of times using
independent simulations that start from different random number seeds. The resulting candidate
structures are clustered, and the candidate at the center of each of the largest clusters is selected
as a predicted structure.
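The procedure described above can be illustrated with a deliberately simplified sketch. It is not the Rosetta implementation: each residue is reduced to a single torsion-like angle, and the fragment library, the scoring function toy_energy, the clustering radius, and all function names are hypothetical stand-ins chosen only to show the flow of fragment insertion, Metropolis acceptance, independent seeded runs, and selection of a cluster center.

```python
import math
import random

# Toy model: one torsion-like angle (degrees) per residue.
# FRAGMENT_LIBRARY and toy_energy are made-up stand-ins, not Rosetta data.
FRAGMENT_LENGTH = 3
SEQUENCE_LENGTH = 9
FRAGMENT_LIBRARY = [
    [-60.0, -60.0, -60.0],     # helix-like fragment
    [-120.0, -120.0, -120.0],  # strand-like fragment
    [-60.0, -120.0, -60.0],
    [60.0, 60.0, 60.0],
]


def toy_energy(conformation):
    """Made-up score: favors angles near -60 and similar neighboring angles."""
    local = sum((angle + 60.0) ** 2 for angle in conformation)
    smooth = sum((a - b) ** 2 for a, b in zip(conformation, conformation[1:]))
    return (local + 0.5 * smooth) / 1000.0


def fold_once(seed, steps=2000, temperature=1.0):
    """One independent Monte Carlo run built from random fragment insertions."""
    rng = random.Random(seed)
    conformation = [rng.uniform(-180.0, 180.0) for _ in range(SEQUENCE_LENGTH)]
    energy = toy_energy(conformation)
    for _ in range(steps):
        # Overwrite a random window with a random library fragment.
        start = rng.randrange(SEQUENCE_LENGTH - FRAGMENT_LENGTH + 1)
        trial = list(conformation)
        trial[start:start + FRAGMENT_LENGTH] = rng.choice(FRAGMENT_LIBRARY)
        trial_energy = toy_energy(trial)
        delta = trial_energy - energy
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            conformation, energy = trial, trial_energy
    return conformation, energy


def distance(a, b):
    """Root-mean-square difference between two angle lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def largest_cluster_center(candidates, radius=30.0):
    """Return the candidate with the most neighbors within `radius`."""
    def neighbors(candidate):
        return sum(1 for other in candidates if distance(candidate, other) <= radius)
    return max(candidates, key=neighbors)


# Independent simulations that differ only in their random number seed.
candidates = [fold_once(seed)[0] for seed in range(20)]
predicted = largest_cluster_center(candidates)
print(predicted, toy_energy(predicted))
```

Counting neighbors within a fixed radius is only one simple way to approximate the center of the largest cluster; a real pipeline would compare three-dimensional coordinates rather than angle lists.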
The Rosetta method is based on the assumption that the distribution of conformations sampled by a
local amino acid sequence in known structures approximates the set of local conformations that the
same sequence segment in a protein of unknown structure would have available during the folding
process. Given the possible conformations that each segment can assume, the combination of local
conformations with the lowest overall energy is taken as a candidate structure.
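For a very small case, the idea of choosing the lowest-energy combination of local conformations can be shown by exhaustive enumeration. The sketch below is an illustration only: the three segments, their allowed conformations, the local energies, and the junction_penalty rule are all invented for the example, and a realistic search space is far too large to enumerate, which is why Monte Carlo sampling is used in practice.

```python
from itertools import product

# Hypothetical local conformations available to each of three segments,
# with made-up local energies (lower is better).
LOCAL_OPTIONS = [
    {"helix": -2.0, "strand": -1.5},
    {"helix": -1.0, "strand": -2.5, "loop": -0.5},
    {"helix": -2.2, "loop": -1.0},
]


def junction_penalty(left, right):
    """Made-up cost when adjacent segments adopt incompatible conformations."""
    if left == right or "loop" in (left, right):
        return 0.0
    return 1.5


def total_energy(combination):
    """Sum of local energies plus penalties at each segment junction."""
    local = sum(options[state]
                for options, state in zip(LOCAL_OPTIONS, combination))
    junctions = sum(junction_penalty(a, b)
                    for a, b in zip(combination, combination[1:]))
    return local + junctions


def best_combination():
    """Enumerate every combination and keep the one with the lowest energy."""
    combos = product(*(options.keys() for options in LOCAL_OPTIONS))
    return min(combos, key=total_energy)


best = best_combination()
print(best, total_energy(best))
```

With these invented numbers, picking the best conformation for each segment independently would give helix, strand, helix, but the junction penalties make the all-helix combination the lowest overall. The example is meant to show why the overall energy of the combination, not each local energy alone, decides the candidate.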
Although this method, which has been used with very good results on protein segments of up to
about 90 residues in length, is often billed as a form of ab initio protein structure prediction, it
actually represents a hybrid method because it incorporates data from a library of protein structures.
Heuristic Methods
While ab initio methods of protein structure prediction can be used to identify novel structures from
sequence data alone, they're too computationally intensive for all but the smallest proteins.
For most proteins of unknown structure, short of X-ray crystallography and nuclear magnetic
resonance (NMR) studies, heuristic methods offer the fastest, most accurate means of deriving
structure from amino acid sequence data. Heuristic methods use a database of protein structures to
make predictions about the structure of newly sequenced proteins. A basic premise of heuristic
methods is that most newly sequenced proteins share structural similarities with proteins whose
structures and sequences are known, and that these structures can serve as templates for new
sequences. It's also assumed that because relatively substantial changes in amino acid sequence may
not significantly alter the protein structure, similarity in sequences implies similarity in structure.
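The template premise can be sketched as a simple selection step: score each known structure by how well its sequence matches the new sequence and keep the best match. The code below is a minimal illustration under stated assumptions. It expects sequences that have already been aligned (gaps written as "-"), uses percent identity as the similarity measure, and applies a 30 percent cutoff; the cutoff, the function names, and the example sequences are hypothetical and not taken from the text.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over columns where neither has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def choose_template(alignments, min_identity=30.0):
    """Return the best-scoring template name, or None if none passes the cutoff.

    `alignments` maps a template name to (aligned query, aligned template).
    """
    best_name, best_score = None, min_identity
    for name, (query, template) in alignments.items():
        score = percent_identity(query, template)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Invented example alignments of one query against three known structures.
alignments = {
    "template_A": ("MKTAYIAKQR", "MKTAYLAKQR"),
    "template_B": ("MKTAYIAKQR", "MR-AWIGK-R"),
    "template_C": ("MKTAYIAKQR", "GSPLWNDEHV"),
}
print(choose_template(alignments))
```

Returning None when no template passes the cutoff reflects the case in which no suitable known structure is available to model from.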
The primary limitation of a heuristic approach to protein structure prediction is that it can't model a
novel structure. There must be a suitable template—meaning that the sequences of the template and
the new protein can be aligned—available to work with as a starting point. For this reason, heuristic
approaches often have difficulty with novel mutations that induce structural changes in the new