INTRODUCTION (Protein Folding)

Our interest in protein structure and function is both fundamental and practical. There is astounding beauty in the mastery with which nature has tailored molecules for specific functions, activity levels, regulatory properties, and integration into complex macromolecular assemblies. As will be discussed, in most cases these molecules spontaneously assume a final, stable folded structure. Thus, all of the information necessary for biological activity is contained in the simple sequence of amino acids as encoded by the DNA. Practically speaking, predicting protein structure, stability, and function from the primary sequence will open myriad opportunities in medicine (e.g., drug discovery and understanding the molecular basis of disease), industry and manufacturing (e.g., biocatalysis and bioprocessing), and the environment (e.g., bioremediation).

Proteins are linear polymers of amino acids joined by amide linkages, commonly called peptide bonds. The "backbone" atoms include the amide linkages separated by a carbon that is derivatized by any one of 20 common side chains. At neutral pH, the side chains may be grouped according to their chemical nature as acidic, basic, hydrophobic, or uncharged hydrophilic. Thus, although the backbone of the peptide polymer is a repeating identical unit, the side chains and their distinct properties dictate the nature of the protein. Because a subset of the amino acid side chains is charged at neutral pH (acidic side chains are negative and basic side chains are positive), the protein polymer is a polyelectrolyte. The linear sequence of amino acids is called the primary structure of the protein (Fig. 1). The primary structure dictates the way in which the polypeptide folds into a functional protein, in most cases without instructions from other sources.

Protein families are groups of proteins related by structure or function. A protein family may be structurally diverse but share a particular cluster of amino acids at the active site that defines the class according to some catalytic function (e.g., dehydrogenases and kinases). Alternatively, proteins may share a structural motif that defines the class (e.g., the helix-loop-helix motif of the EF-hand calcium-binding proteins). Proteins with identical function in different organisms often have slightly different primary structures (see below). The presence of certain amino acids relative to others in primary sequences allows putative protein sequences, from the Human Genome Project for example, to be classified into general protein families. Whether this initial classification is valid remains to be seen.

To discover the rules of protein folding, two major approaches have emerged: computational and empirical. The computational approach, often termed proteonomics, attempts to predict the structure of a protein from its sequence by defining a set of rules and criteria for their application. This topic is covered elsewhere in this series. The empirical approach defines global rules for folding based on lessons learned from particular proteins. These two methods are distinctly interwoven [3]. Hypotheses derived from one are testable through the other. In this paper, we will discuss the empirical approach to studying protein folding.

The empirical approach to understanding protein folding has relied heavily on mutational analysis. As mentioned earlier, proteins from different species with identical functions may have slightly different amino acid sequences; that is, they carry mutations relative to one another. Often the mutations are conservative, particularly at amino acids that are critical to the structure or function of the protein. Scientists study the differing physical properties of these related proteins to gain insight into the role of amino acids in the local or global structure and function of the protein. Often mutations are purposely engineered into protein sequences using molecular biological techniques to test hypotheses about the roles of certain amino acids in structure or function. Selective substitution of tryptophan into a sequence allows placement of a convenient spectroscopic probe (see below).

Although proteins are very diverse, the one thing that almost all have in common is that they spontaneously adopt a unique and stable tertiary structure. This is an utter miracle of nature given the complexity of these heterogeneous polymers. The study of protein folding is focused on understanding the rules that govern both the transition into this unique fold and its stability. The transition into the tertiary structure is studied by kinetic methods. Thus, kinetic studies ask the question, "By what pathway is the final tertiary structure folded?" Equilibrium thermodynamic methods, in contrast, ask, "How stable is the final fold and why?" Each of these approaches will be discussed individually.
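
To make the equilibrium question "How stable is the final fold?" concrete, the following Python sketch converts a measured fraction of folded protein into a free energy of unfolding. It assumes the simplest possible model, which the text above does not itself specify: a two-state equilibrium between a native state N and an unfolded state U, with no populated intermediates, so that the unfolding constant is K = [U]/[N] and the free energy of unfolding is -RT ln K. The function name unfolding_free_energy and its parameters are hypothetical.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol*K)


def unfolding_free_energy(fraction_folded, temperature_k=298.15):
    """Free energy of unfolding (kJ/mol) for an assumed two-state N <-> U model.

    Assumption: only the native (N) and unfolded (U) states are populated,
    so K_unfold = [U]/[N] = (1 - f) / f, where f is the fraction folded.
    A positive result means the folded state is favored.
    """
    if not 0.0 < fraction_folded < 1.0:
        raise ValueError("fraction_folded must lie strictly between 0 and 1")
    k_unfold = (1.0 - fraction_folded) / fraction_folded
    return -R_KJ_PER_MOL_K * temperature_k * math.log(k_unfold)


if __name__ == "__main__":
    # Illustrative inputs only, not measured data.
    for f in (0.5, 0.9, 0.99):
        print(f, round(unfolding_free_energy(f), 2))
```

Under this model, equal populations of folded and unfolded protein correspond to a free energy of zero, and each additional factor of ten in favor of the folded state adds roughly 5.7 kJ/mol at 25 °C. Proteins that fold through populated intermediates need a more elaborate model than this sketch provides.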
