Mathematical Modeling and Optimization Methods for De Novo Protein Design - Systems Biology: Networks, Models, and Applications

Monte Carlo sampling. This is the approach adopted by both Desjarlais
and Handel [20] and Kraemer-Pecore et al. [21]. Under this approach,
an ensemble of related backbone conformations close to the template
is generated at random. A sequence is then designed for each of them
under the rigid-backbone assumption, and finally the backbone-sequence
combination with the lowest energy is selected. For symmetric proteins,
the backbone structure can be modeled by parametric fitting, which
should improve computational efficiency. However, the vast majority of
protein structures are nonsymmetric, which makes this parametric
approach infeasible. Su and Mayo [17] overcame this difficulty by
treating α-helices and β-sheets as rigid bodies and designing sequences
for several template variations of the protein Gβ1. Farinas and Regan
[22] considered a discrete set of templates when they designed metal
binding sites in Gβ1, and they identified varied residue positions that
would have been missed if average three-dimensional coordinates had
been used for the calculations. Harbury et al. [23] incorporated
template flexibility through an algebraic parameterization of the
backbone when they designed a family of α-helical bundle proteins with
a right-handed superhelical twist. They achieved a root mean square
coordinate deviation of around 0.2 Å between the predicted structure
and the actual structure of the de novo designed protein.
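
The ensemble strategy described above (perturb the template at random,
design a sequence on each rigid backbone, keep the lowest-energy pair)
can be sketched as follows. This is only an illustration under stated
assumptions: the function names, the uniform coordinate jitter, the
inclusion of the unperturbed template in the ensemble, and the toy
energy rule in toy_design are all hypothetical and are not taken from
the cited studies.

```python
import random


def perturb_backbone(template, max_shift, rng):
    """Return a copy of the template with every coordinate jittered
    uniformly by at most max_shift (same units as the coordinates)."""
    return [tuple(c + rng.uniform(-max_shift, max_shift) for c in atom)
            for atom in template]


def toy_design(backbone):
    """Toy stand-in for a rigid-backbone sequence design step.

    Hypothetical rule: residue 'H' scores +z and residue 'P' scores -z
    at each position; the lower-scoring residue is chosen.
    Returns (sequence, total_energy).
    """
    sequence, energy = [], 0.0
    for _x, _y, z in backbone:
        if z < 0:
            sequence.append("H")
            energy += z
        else:
            sequence.append("P")
            energy += -z
    return "".join(sequence), energy


def design_on_ensemble(template, design_sequence,
                       n_backbones=20, max_shift=0.5, seed=0):
    """Design on the template plus n_backbones random variants and
    return the (backbone, sequence, energy) triple of lowest energy."""
    rng = random.Random(seed)
    candidates = [template] + [
        perturb_backbone(template, max_shift, rng)
        for _ in range(n_backbones)
    ]
    best = None
    for backbone in candidates:
        sequence, energy = design_sequence(backbone)
        if best is None or energy < best[2]:
            best = (backbone, sequence, energy)
    return best


if __name__ == "__main__":
    template = [(0.0, 0.0, 1.0), (3.8, 0.0, -0.5), (7.6, 0.0, 0.2)]
    backbone, sequence, energy = design_on_ensemble(template, toy_design)
    print(sequence, energy)
```

Because the template itself is kept as one candidate, the selected
energy can never be worse than that of the rigid-template design; any
real application would replace toy_design with an actual design
procedure and energy function.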

One natural approach to incorporating backbone flexibility is to allow
for variability at each position in the template. The deterministic in
silico sequence selection method recently proposed by Klepeis et al.
[24,25], which uses integer linear optimization, takes template
flexibility into account by introducing a distance-dependent force
field in the sequence selection stage. A pairwise amino acid
interaction potential, which depends on both the types of the two amino
acids and the distance between them, is used to calculate the total
energy of a sequence. Rather than being a continuous function, the
dependence of the interaction potential on distance is discretized into
bins. With typical bin sizes of 0.5 to 1 Å, the overall protein design
model that Klepeis et al. [24,25] developed implicitly incorporates
backbone movements of roughly the same order of magnitude.
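
To make the role of distance binning concrete, the sketch below scores
a sequence with a binned pairwise potential. It rests on assumptions
that do not come from the source: the bin edges, the potential values
and the two-letter alphabet are invented for illustration, and the
exhaustive search in best_sequence is a simple stand-in for the integer
linear optimization used by Klepeis et al., which is not reproduced
here.

```python
import math
from itertools import product

# Hypothetical bin edges in Å (1 Å wide bins between 3 and 8 Å).
BIN_EDGES = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0]


def distance_bin(d):
    """Return the index of the bin containing distance d,
    or None when d lies outside the tabulated range."""
    if d < BIN_EDGES[0] or d >= BIN_EDGES[-1]:
        return None
    for b in range(len(BIN_EDGES) - 1):
        if BIN_EDGES[b] <= d < BIN_EDGES[b + 1]:
            return b
    return None


def total_energy(sequence, coords, potential):
    """Sum the binned pairwise potential over all residue pairs.

    potential maps ((aa1, aa2), bin_index) to an energy, with the
    amino acid pair stored in sorted order; missing entries count as 0.
    """
    energy = 0.0
    for i in range(len(sequence)):
        for k in range(i + 1, len(sequence)):
            b = distance_bin(math.dist(coords[i], coords[k]))
            if b is not None:
                pair = tuple(sorted((sequence[i], sequence[k])))
                energy += potential.get((pair, b), 0.0)
    return energy


def best_sequence(coords, alphabet, potential):
    """Exhaustively search all sequences (feasible only for tiny cases)."""
    best = min(product(alphabet, repeat=len(coords)),
               key=lambda seq: total_energy(seq, coords, potential))
    return "".join(best)


if __name__ == "__main__":
    # Invented potential values for the 4-5 Å bin (index 1).
    potential = {(("A", "L"), 1): -1.0, (("L", "L"), 1): -2.0}
    for d in (4.2, 4.8):
        coords = [(0.0, 0.0, 0.0), (d, 0.0, 0.0)]
        print(d, total_energy("AL", coords, potential))
```

The two distances in the usage example fall in the same bin and so
receive the same energy, which is the sense in which a binned potential
tolerates backbone shifts on the order of the bin width.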

MATHEMATICAL MODELING AND OPTIMIZATION METHODS

Once an energy function has been defined, sequence selection is
accomplished through an optimization-based search designed to minimize
the energy objective. Both stochastic and deterministic methods have
been applied to the computational protein design problem.
Self-consistent mean field (SCMF) [26] and dead-end elimination (DEE)
[27] are both good examples of deterministic methods.
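
As an illustration of a deterministic method, the sketch below applies
the original single-residue dead-end elimination criterion: a choice r
at position i is discarded when its best possible energy is still
higher than the worst possible energy of some alternative t at the same
position. The data layout (self_e, pair_e) and the numerical values are
assumptions made for this example and are not taken from the text.

```python
def pair_energy(pair_e, i, r, j, s):
    """Look up the pair energy between choice r at position i and
    choice s at position j; tables are stored once, for i < j."""
    if i < j:
        return pair_e[(i, j)][r][s]
    return pair_e[(j, i)][s][r]


def dee_prune(self_e, pair_e):
    """Repeatedly apply the single-residue DEE criterion.

    self_e[i][r]         : self energy of choice r at position i
    pair_e[(i, j)][r][s] : pair energy for i < j
    Returns a list of sets holding the surviving choices per position.
    """
    n = len(self_e)
    alive = [set(range(len(e))) for e in self_e]

    def bound(i, r, pick):
        # pick=min gives the best case for r, pick=max the worst case.
        return self_e[i][r] + sum(
            pick(pair_energy(pair_e, i, r, j, s) for s in alive[j])
            for j in range(n) if j != i
        )

    changed = True
    while changed:
        changed = False
        for i in range(n):
            for r in list(alive[i]):
                lower = bound(i, r, min)
                # r is a dead end if some competitor t always beats it.
                if any(lower > bound(i, t, max)
                       for t in alive[i] if t != r):
                    alive[i].discard(r)
                    changed = True
    return alive


if __name__ == "__main__":
    # Invented energies: two positions with two choices each.
    self_e = [[0.0, 5.0], [0.0, 1.0]]
    pair_e = {(0, 1): [[-1.0, 0.0], [0.5, 0.2]]}
    print(dee_prune(self_e, pair_e))
```

In this small example every choice except the first is eliminated at
both positions, so the pruning alone identifies the minimum-energy
combination; in general DEE only shrinks the search space, and the
survivors must still be searched by another method.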
