Biomedical Engineering Reference
and of alternative, competing structures may be quite different. Sequences with
the same ground state that can be evolutionarily interconnected through a point
mutation are neighbors on the neutral network. We define the neutral fraction $z(S)$
as the fraction of point mutations of sequence $S$ that do not change the ground
state. If the average neutral fraction in the neutral network, $z(S)$, is above a critical
threshold, random graph theory predicts that the neutral network is connected,
whereas below the threshold the neutral network is formed by a giant connected
component and several small components. For the four-letter AUGC alphabet, the
neutral fractions of some tRNA-like structures were computed and shown to be
$z(S) \approx 0.29$, slightly smaller than the critical threshold $z_{cr} = 0.37$ [2], so that
the giant component dominates the neutral network. Furthermore, it was shown that
the neutral networks of any two common structures are close in sequence space
in the sense that there are sequences for which both structures are the ground
state (i.e., have very similar low energies), which allows the two neutral
networks to be connected [2]. These studies highlighted the importance of neutrality for molecular
adaptation [4]. We note, however, that such multiconformational sequences would
not be observed in evolution if selection required that the target structure must be
sufficiently stable against misfolded conformations.
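
The definition of the neutral fraction can be made concrete with a minimal sketch. It enumerates all point mutants of a sequence over the AUGC alphabet and counts those whose ground state is unchanged. The function `toy_ground_state` is a hypothetical stand-in that we introduce only for illustration (it classifies sequences by GC content); in a real study it would be replaced by a secondary structure prediction routine. The names `point_mutants` and `neutral_fraction` are likewise our own.

```python
ALPHABET = "AUGC"  # four-letter RNA alphabet used in the text


def point_mutants(seq, alphabet=ALPHABET):
    """Yield every sequence that differs from seq at exactly one site."""
    for i, old in enumerate(seq):
        for new in alphabet:
            if new != old:
                yield seq[:i] + new + seq[i + 1:]


def neutral_fraction(seq, ground_state, alphabet=ALPHABET):
    """Fraction of point mutants of seq with the same ground state as seq."""
    mutants = list(point_mutants(seq, alphabet))
    if not mutants:
        raise ValueError("sequence must contain at least one site")
    target = ground_state(seq)
    neutral = sum(1 for m in mutants if ground_state(m) == target)
    return neutral / len(mutants)


def toy_ground_state(seq):
    """Hypothetical stand-in for a folding routine (NOT a real predictor).

    It labels a sequence by whether at least half of its bases are G or C.
    """
    gc = sum(1 for base in seq if base in "GC")
    return "high-GC" if 2 * gc >= len(seq) else "low-GC"


z = neutral_fraction("GCAU", toy_ground_state)
```

For the toy map, 8 of the 12 point mutants of `GCAU` keep the label `high-GC`, so `z` is 8/12. Averaging `neutral_fraction` over the sequences of a neutral network would give the quantity that is compared with the critical threshold.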
We now turn from RNA to proteins, which are the main focus of this review
chapter. Motivated by the earlier work on RNA, several groups have investigated
proteins intensively since the late 1990s [5-14]. In this case,
there exists no approach to reliably predict the lowest energy structure of a protein
sequence, and we must resort to approximations. It is common to represent a protein
structure as a contact matrix whose element $C_{ij}$ equals 1 if the residues at sites $i$
and $j$ are close in the three-dimensional folded structure and zero otherwise. This
representation is formally similar to the secondary structure representation of RNA.
However, in the latter case, each site can interact with at most one other site, whereas
in proteins each site has multiple contacts. It has been shown that the contact matrix
is sufficient to reconstruct the whole three-dimensional structure of the protein with
very high accuracy [15]. In this context, one usually assumes that the free energy of
a protein with sequence A folded into the contact matrix C is given by the sum of
pairwise contact interactions,
\[
E(A, C) = \sum_{ij} C_{ij} \, U(A_i, A_j), \tag{1}
\]
where $U(a, b)$ is the contact interaction matrix that expresses the free energy gained
when amino acids $a$ and $b$ are brought into contact. In most of the results reported
here, the matrix determined in Bastolla et al. [16] has been used. For proteins that
fold with two-state thermodynamics, i.e., for which only the native structure and the
unfolded structure are thermodynamically important, stability against unfolding is
defined as the free energy difference between the folded and the unfolded state, and
it can be estimated as $\Delta G \approx E(A, C_{nat}) + sL$, where $C_{nat}$ is the native structure,
$L$ is protein length, and $s = 0.074$ is an entropic parameter that was determined
by fitting the above equation to a set of 20 experimentally measured unfolding
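
The contact energy of Eq. (1) and the stability estimate built on it can be illustrated with a minimal sketch. It rests on several assumptions of ours: each contact pair is counted once (i < j), the interaction table `TOY_U` is a hypothetical two-letter (H/P) example and not the matrix of Bastolla et al. [16], and the helper names are introduced only for illustration. The value of s is the one quoted in the text.

```python
S_ENTROPIC = 0.074  # entropic parameter s quoted in the text

# Hypothetical toy interaction values for a two-letter (H/P) alphabet.
# These are NOT the contact interaction matrix of reference [16].
TOY_U = {("H", "H"): -1.0, ("H", "P"): 0.0, ("P", "P"): 0.0}


def pair_energy(u, a, b):
    """Look up U(a, b), treating the interaction table as symmetric."""
    return u[(a, b)] if (a, b) in u else u[(b, a)]


def contact_matrix(length, pairs):
    """Build a symmetric 0/1 contact matrix from a list of (i, j) contacts."""
    c = [[0] * length for _ in range(length)]
    for i, j in pairs:
        c[i][j] = c[j][i] = 1
    return c


def contact_energy(seq, contacts, u):
    """Sum of U(A_i, A_j) over contacts, each pair i < j counted once (assumed)."""
    energy = 0.0
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            if contacts[i][j]:
                energy += pair_energy(u, seq[i], seq[j])
    return energy


def unfolding_free_energy(seq, native_contacts, u, s=S_ENTROPIC):
    """Estimate the folded-minus-unfolded free energy as E(A, C_nat) + s * L."""
    return contact_energy(seq, native_contacts, u) + s * len(seq)


# Toy example: a four-residue chain with three contacts in its "native" state.
native = contact_matrix(4, [(0, 2), (0, 3), (1, 3)])
energy = contact_energy("HPHH", native, TOY_U)
delta_g = unfolding_free_energy("HPHH", native, TOY_U)
```

In this toy case the two H-H contacts each contribute -1.0 and the P-H contact contributes 0.0, so `energy` is -2.0 and `delta_g` is about -1.704. With a real 20-letter interaction matrix only `TOY_U` would change.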