Mathematical Modeling and Optimization Methods for De Novo Protein Design - Systems Biology: Networks, Models, and Applications


proteins, RosettaDesign yielded sequences of 70-80% identity as the

final results of energy optimization when multiple runs were started

with different random sequences [41]. Originating in genetics and

evolution, genetic algorithms generate a multitude of random amino

acid sequences for a fixed template and exchange fragments between them. Sequences with low

energies form hybrids with other sequences while those with high

energies are eliminated in an iterative process which only terminates

when a converged solution is attained [42]. Desjarlais and Handel [20]

have applied a two-stage combination of Monte Carlo and genetic

algorithms to design the hydrophobic core of protein 434cro. Both

Monte Carlo methods and genetic algorithms can search a larger

combinatorial space than deterministic methods, but they share

the common disadvantage of not consistently finding the global

energy minimum.
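The select-and-hybridize loop described above can be sketched in a few lines of Python. This is a minimal illustration and not the procedure of [20] or [42]: the energy function `toy_energy` is a hypothetical stand-in that counts mismatches to a made-up target sequence, the point-mutation step is an added assumption, and "convergence" is approximated by stopping once the best energy has not improved for `patience` generations.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_energy(seq, target="MKVLAAGI"):
    # Hypothetical energy: number of mismatches to an invented target.
    # A real design run would score the sequence on the fixed template.
    return sum(a != b for a, b in zip(seq, target))


def crossover(parent_a, parent_b, rng):
    # Form a hybrid by joining a prefix of one parent to a suffix of the other.
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


def mutate(seq, rng, rate=0.05):
    # Assumed extra step: occasional random substitutions keep diversity.
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else aa for aa in seq
    )


def genetic_search(energy, length, pop_size=60, keep=20,
                   max_gens=500, patience=50, seed=0):
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        for _ in range(pop_size)
    ]
    best, best_energy, stale = None, float("inf"), 0
    for _ in range(max_gens):
        # Low-energy sequences survive; high-energy ones are eliminated.
        population.sort(key=energy)
        survivors = population[:keep]
        top_energy = energy(survivors[0])
        if top_energy < best_energy:
            best, best_energy, stale = survivors[0], top_energy, 0
        else:
            stale += 1
        if stale >= patience:
            break  # treated as converged (an assumption of this sketch)
        children = []
        while len(children) < pop_size - keep:
            parent_a, parent_b = rng.sample(survivors, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = survivors + children
    return best, best_energy


if __name__ == "__main__":
    sequence, score = genetic_search(toy_energy, length=8)
    print(sequence, score)
```

Because the search is stochastic, different seeds may return different sequences, which mirrors the lack of consistency in reaching the global minimum noted above.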

Recent methods attempt to avoid the problem of optimizing residue

interactions by manipulation of the shapes of free energy landscapes [43].

Another class of methods focuses on a statistical theory for combinatorial

protein libraries which provides probabilities for the selection of amino

acids in each sequence position [44-46]. The set of site-specific amino acid

probabilities obtained at the end actually represents the sequence with

the maximum entropy subject to all of the constraints imposed [44,45,47].

This statistical computationally assisted design strategy (SCADS) has been

employed to characterize the structure and functions of membrane

protein KcsA and to enhance the catalytic activity of a protein with a

dinuclear metal center [47]. It has also been used to calculate the

identity probabilities of the varied positions in the immunoglobulin light

chain-binding domain of protein L [45]. SCADS serves as a useful

framework for interpreting and designing protein combinatorial libraries, as

it provides clues about the regions of the sequence space that are most

likely to produce well-folded structures [48].
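To make the maximum-entropy idea concrete, the following Python sketch solves a deliberately simplified version of the problem. It assumes that each position has an independent, precomputed energy for every amino acid (the `energies` values below are invented) and that the only constraint is a target mean total energy. Under those assumptions the entropy-maximizing probabilities take a Boltzmann-like form, w_i(a) ∝ exp(-β ε_i(a)), and β is found by bisection. The method of [44-47] handles richer constraints and inter-residue interactions, so this is an illustration of the principle only.

```python
import math


def site_probabilities(site_energies, beta):
    # Boltzmann-like weights for one position; shift by the minimum
    # energy for numerical stability.
    e_min = min(site_energies.values())
    weights = {a: math.exp(-beta * (e - e_min))
               for a, e in site_energies.items()}
    z = sum(weights.values())
    return {a: w / z for a, w in weights.items()}


def mean_energy(energies, beta):
    # Expected total energy summed over independent positions.
    total = 0.0
    for site in energies:
        probs = site_probabilities(site, beta)
        total += sum(probs[a] * site[a] for a in site)
    return total


def max_entropy_probabilities(energies, target, lo=0.0, hi=50.0, tol=1e-9):
    # The mean energy decreases monotonically as beta grows,
    # so the constraint can be met by bisection on beta.
    if not mean_energy(energies, hi) <= target <= mean_energy(energies, lo):
        raise ValueError("target mean energy is outside the reachable range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_energy(energies, mid) > target:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    return beta, [site_probabilities(site, beta) for site in energies]


if __name__ == "__main__":
    # Invented per-site energies for a two-position, three-letter example.
    energies = [
        {"A": 0.0, "L": 1.0, "K": 2.0},
        {"A": 1.0, "L": 0.0, "K": 0.5},
    ]
    beta, probs = max_entropy_probabilities(energies, target=1.0)
    for i, p in enumerate(probs):
        print(i, {a: round(v, 3) for a, v in p.items()})
```

With no energy constraint (β = 0) every amino acid is equally likely at each position; tightening the target mean energy shifts probability toward the lower-energy residues.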

Several sequence selection approaches have been tested and validated

by experiment, thereby firmly establishing the feasibility of

computational protein design. The first computational design of a full sequence

to be experimentally characterized was that of a stable zinc-finger fold

(ββα) using a combination of a backbone-dependent rotamer library

with atomistic-level modeling and a dead-end elimination-based

algorithm [49]. Recently, Kuhlman et al. [50] introduced a

computational framework that iterates between sequence design and structure

prediction, designed a new fold for a 93-residue α/β protein, and validated

its fold and stability experimentally. Despite these accomplishments, the

development of a computational protein design technique to rigorously

address the problems of fold stability and functional design remains a

challenge. As mentioned earlier, one important reason for this is either

the almost universal specification of a fixed backbone, or the use of a
