Collagen Part 1 (Molecular Biology)

Proteins of the collagen family are the major components of the extracellular matrix and facilitate the formation and maintenance of a multicellular system. Collagens serve as solid-state regulators of cellular function and as scaffolding for tissue architecture, particularly in large vertebrates. They contain many proline and glycine residues and a distinct secondary structure: the polyproline II-like helix, which differs from the α-helix, β-sheet, and turn secondary structures found in other proteins. This regular secondary structure arises because the sequences of the collagen polypeptide chains consist largely of the repeated sequence Gly-X-Y, with abundant proline residues at the X positions and abundant hydroxyproline (Hyp) residues at the Y positions. Three polyproline II-like helices constitute the supersecondary structure of the collagenous triple helix, which is stabilized through hydrogen bonds nearly perpendicular to the triple-helical axis (see Triple-Helical Proteins). The collagen protein family includes all the structural proteins of the extracellular matrix with triple-helical collagenous domains in their molecular architecture.

The collagen superfamily is classified into groups (Table 1) according to molecular and/or supramolecular structure. The molecular structures of the collagenous proteins can be depicted with ball-and-stick models (Fig. 1). The balls represent noncollagenous or globular domains, without abundant glycine and proline residues, while the sticks show the collagenous triple helices. Some of the triple-helical domains, including that of type IV collagen, have interruptions in the Gly-X-Y triplets, in that glycine residues do not always occupy every third position.

Figure 1. Molecular architecture of the collagen superfamily depicted with ball-and-stick models (stick = collagenous, triple-helical domain; ball = noncollagenous domain).

Table 1. Collagen Classification

Family | Types of Molecules or Chains
Fibrillar collagen | Type I, type II, type III, type V, type XI
Meshwork-forming collagen | α1, α2, α3, α4, α5, and α6 chains of type IV
Fibril-associated collagen with interrupted triple helices (FACIT) | Type IX, type XII, type XIV, type XVI
Collagen with long triple helix | Type VII
Collagen with short triple helix | Type VIII, type X, type VI
Membrane-associated collagen | Type XVII
Others | Type XIII, type XV, type XVIII, type XIX

The structure, assembly, and supramolecular aggregation of type I collagen form the prototype from which our understanding of collagenous structure has developed, particularly for the fibrillar collagens. Type I collagen is one of the major components of the fibrous collagens, which occur in the greatest amounts. The structure and characteristic properties of triple-helical domains have been deduced from the study of type I collagen, together with comparative studies of other types of collagen. Unless otherwise mentioned, the description of collagen triple helices given below is based primarily on information obtained through studies of type I collagen. These properties are generally shared with the triple-helical domains in other collagen types, especially the characteristic features that distinguish them from the α-helix or β structure in noncollagenous proteins. However, more recent studies suggest that the triple-helical regions have structures and properties specific to each type, particularly in their intermolecular interactions. The characteristic features of the various collagen types are due to differences in the distribution of charged residues along the triple helix, in the content of glycosylated hydroxylysine and bulky hydrophobic residues, and in imperfections of the Gly-X-Y repeat, a repeat that is itself a prerequisite for the triple-helical conformation.

1. Collagenous Triple Helix; Common and Specific Features among Different Types of Collagen

1.1. Shape and Structure of the Triple-Helical Domains

A variety of physicochemical studies, including direct visualization of individual molecules by rotary shadowing techniques in the electron microscope, have demonstrated the rod-like nature of the collagen triple-helical domain. The molecules of type I collagen have a length of about 300 nm (Fig. 1) and a diameter of about 1.4 nm. The rod is neither rigid nor randomly flexible, but appears to possess an intermediate level of semiflexibility.

Details of the three-dimensional structure of the collagen triple helix were established by model building to fit X-ray fiber diffraction data. The frequent occurrence of proline and hydroxyproline residues (which together account for approximately two-ninths (~22%) of the amino acid residues) favors a polyproline II-like conformation. The axial distance between one amino acid and the next in the polyproline II-like helical structure is 0.286 nm, close to twice that in the α-helix (0.15 nm). The triple-helical structure of the synthetic peptide (Pro-Pro-Gly)10 was also determined by X-ray crystallography. The overall helical symmetry is left-handed, with 10 residues per three turns (108°/residue) or 7 residues per two turns (103°/residue), with a pitch of 2.9 nm. The three polyproline II-like helical chains are further coiled about a central axis to form a right-handed helix (Fig. 2). The occurrence of glycine as every third residue gives rise to a polymer of repeating tripeptide units with the formula (-Gly-X-Y-).
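The helical parameters quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the function and constant names are illustrative assumptions, and only the numbers (0.286 nm, 0.15 nm, 10 residues per three turns, 7 residues per two turns) come from the text.

```python
# Axial rise per residue, as quoted in the text (nm).
RISE_COLLAGEN_NM = 0.286  # polyproline II-like helix
RISE_ALPHA_NM = 0.15      # alpha-helix


def rotation_per_residue(residues: int, turns: int) -> float:
    """Rotation about the helix axis per residue, in degrees."""
    return 360.0 * turns / residues


def axial_repeat_nm(residues: int, rise_nm: float = RISE_COLLAGEN_NM) -> float:
    """Axial length spanned by one full symmetry repeat of `residues` residues."""
    return residues * rise_nm


if __name__ == "__main__":
    print(round(rotation_per_residue(10, 3), 1))        # 108.0
    print(round(rotation_per_residue(7, 2), 1))         # 102.9
    print(round(axial_repeat_nm(10), 2))                # 2.86
    print(round(RISE_COLLAGEN_NM / RISE_ALPHA_NM, 2))   # 1.91
```

Under these assumptions, ten residues of the 10/3 symmetry span about 2.86 nm along the axis, which matches the quoted value of 2.9 nm, and the rise per residue is a little under twice that of the α-helix.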

Figure 2. Collagenous triple helix. (a) Gly-Pro-Hyp repeated sequence in a model of part of one α chain in the collagenous triple helix. Both NH and CO groups project perpendicular to the fibrillar axis (C, black; N, blue; O, red; H, gray). (b) Backbone of the collagenous triple helix. (c) Gly-Pro-Hyp trimer (left). Note that there is a groove on the surface of the helix. A part of the human type III collagen molecule, [α1(III)]3, in the sequence GITGARGLAGP (right). Note that all the residues except Gly project to the outer surface of the molecule (C, yellow; N, blue; O, red; H, white).


1.2. Thermal Stability of the Triple-Helical Conformation

Flexibility of the collagenous triple helix may vary along the chain. Collagen triple helices of different types have varying flexibility, depending on which residues occupy the X and Y positions. Yet the thermal stability, in terms of the helix-to-coil transition temperature (see Helix-Coil Theory), is similar regardless of the type of collagen within the same animal or animal tissue. The collagenous domains are heat-stable up to the upper limit of animal body temperature. Heat denaturation of collagenous domains starts around 37°C in mammalian collagens (Fig. 3). A more recent study of the bovine type IV collagen triple-helical domain indicated that this domain also has a denaturation temperature above 37°C. This suggests that interruptions in the triplet repeats do not greatly decrease the thermal stability.

Figure 3. Denaturation temperature of collagen triple helix. The temperature was raised stepwise by 1.5°C at intervals of 20 min, and the specific viscosity (ηsp) was measured. Pepsin-treated acid-soluble collagen from calf skin (0.8 mg/mL) was dissolved in 0.15 M potassium phosphate buffer, pH 6.8, and 1 M glucose. Open circles indicate the specific viscosity of the native collagen solution with increasing temperature; filled circles correspond to the values of the denatured collagen solution when the temperature was lowered in the same way as it was raised. The broken line is drawn through values 5% lower in specific viscosity than those expected for the native collagen solution. The triangle (Tm) corresponds to the denaturation temperature obtained by the conventional method of taking the midpoint of the curve.

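The two readouts described in the caption of Fig. 3, the midpoint of the melting curve and the temperature at which the viscosity falls 5% below the native value, can be sketched as simple linear interpolations. The Python code below is only an illustration: the function names are hypothetical, the native baseline is assumed to be constant and equal to the first reading (the figure uses the value expected for the native solution at each temperature), and the data points are invented for the example rather than taken from the figure.

```python
from typing import Sequence


def _interpolate(t0: float, t1: float, v0: float, v1: float, target: float) -> float:
    """Temperature between t0 and t1 at which the signal equals `target`."""
    return t0 + (v0 - target) / (v0 - v1) * (t1 - t0)


def midpoint_temperature(temps_c: Sequence[float], signal: Sequence[float]) -> float:
    """Temperature at which the signal is halfway between its first and last values."""
    half = (signal[0] + signal[-1]) / 2.0
    for i in range(1, len(temps_c)):
        v0, v1 = signal[i - 1], signal[i]
        if v0 != v1 and (v0 - half) * (v1 - half) <= 0:
            return _interpolate(temps_c[i - 1], temps_c[i], v0, v1, half)
    raise ValueError("signal never crosses its midpoint")


def onset_temperature(temps_c: Sequence[float], signal: Sequence[float],
                      loss: float = 0.05) -> float:
    """Temperature at which the signal first falls `loss` below the first reading.

    Assumption: the native baseline is constant and equal to signal[0].
    """
    threshold = signal[0] * (1.0 - loss)
    for i in range(1, len(temps_c)):
        v0, v1 = signal[i - 1], signal[i]
        if v0 > threshold >= v1:
            return _interpolate(temps_c[i - 1], temps_c[i], v0, v1, threshold)
    raise ValueError("signal never falls below the threshold")


if __name__ == "__main__":
    # Hypothetical relative viscosities in 1.5 degree steps; not data from Fig. 3.
    temps = [34.0, 35.5, 37.0, 38.5, 40.0, 41.5]
    visc = [1.0, 1.0, 0.9, 0.5, 0.1, 0.1]
    print("midpoint:", midpoint_temperature(temps, visc))
    print("5% onset:", onset_temperature(temps, visc))
```

With such a curve the 5% onset lies below the midpoint, which illustrates why the two conventions can report different denaturation temperatures for the same sample.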

1.3. Primary Sequence Required for the Triple-Helical Conformation

The characteristic primary structure of polypeptides adopting collagenous triple-helical structures consists of repeats of the sequence Gly-X-Y, where Gly represents glycine and X or Y represents any other amino acid residue. Glycine is the only amino acid that can pack tightly at the center of the triple-stranded collagen monomer, where it provides NH groups for hydrogen bonding to O=C groups in the peptide bonds of the other chains (Fig. 4). The stability of the triple-helical conformation is due in part to these hydrogen bonds, which are aligned nearly perpendicular to the helical axis. Interruptions of the Gly-X-Y repeats decrease the stability of the conformation. Substitution of a glycine residue within the Gly-X-Y sequence of a triple-helical domain destabilizes and disrupts the helical conformation. In the triple helix, the side chains of all the X and Y residues are exposed on the surface.
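The Gly-X-Y requirement lends itself to a simple check on a one-letter amino acid sequence. The sketch below is a hypothetical helper, not an established tool: it assumes the sequence starts in frame at a Gly position and only reports triplets whose first residue is not glycine, so it detects substitutions but does not re-register the frame after an insertion or deletion. The example sequence GITGARGLAGP is the type III collagen fragment shown in Fig. 2; the substituted variant is invented for illustration.

```python
def triplet_interruptions(sequence: str) -> list[int]:
    """Return 0-based start positions of triplets that do not begin with Gly.

    Assumes the sequence is in frame, i.e. position 0 is a Gly position.
    """
    seq = sequence.upper()
    return [i for i in range(0, len(seq), 3) if seq[i] != "G"]


if __name__ == "__main__":
    print(triplet_interruptions("GITGARGLAGP"))  # []
    print(triplet_interruptions("GITGARSLAGP"))  # [6] (hypothetical Gly -> Ser change)
```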

Figure 4. A schematic drawing illustrating direct interchain hydrogen bonding between (Gly)NH and CO(Pro in X position) in the collagenous triple helix. The three chains (a, b, and c) have the repeated Gly-Pro(X)-Pro(Y) sequence, e.g., -Ga-Xa-Ya-, -Gb-Xb-Yb-.


1.4. Proline Residues at Position X and Hydroxyproline Residues at Position Y

The helical conformation of individual α chains arises largely as a result of steric repulsion between proline residues in the X position (approximately 120 residues per α1(I) chain) and 4-hydroxyproline residues in the Y position (approximately 100 residues per α1(I) chain), and because the five-membered rings of the imino acids are rigid and limit rotation about the peptide N-C bond. The proline and hydroxyproline residues also stabilize the triple helix. The contribution to helix stability from the pyrrolidine rings of proline and hydroxyproline is thought to be entropic, in that these residues may not acquire as much freedom on denaturation as other residues. Another interpretation of the contribution of the pyrrolidine rings to the stability of the triple-helical conformation relates to the fact that these side chains are located on the surface of the triple helix: pyrrolidine rings are surprisingly favorable in contact with water. Furthermore, hydroxylation of the proline residues before Gly, that is, at Y positions, greatly increases the thermal stability, whereas hydroxyproline residues at X positions, that is, after Gly, decrease the stability. Whether the hydroxyl group is at the 3 or 4 position of the proline residue also greatly influences the thermal stability of the triple-helical conformation.

The amino acid sequence responsible for the formation and stability of the collagenous triple helix is susceptible to proteolysis by collagenases. The sequence specifically recognized by bacterial collagenase is usually found in sequences such as -Gly-X-Y-Gly-Pro-Y-Gly-X-Hyp-Gly-X-Y-. Cleavage occurs at the amino side of Gly residues.

1.5. Resistance of the Triple-Helical Domains to Pepsin or Other Proteinases

The intact triple-helical domain is generally resistant to most proteinases. However, when heated above physiological temperatures, it undergoes a helix-to-coil transition and, once melted, becomes susceptible to degradative enzymes. Protein consisting primarily of collagenous domains stays in solution at acidic pH, and the collagenous domains have generally been isolated by pepsin treatment of otherwise insoluble tissues. In a number of triple helices of recently discovered collagen family members, the occurrence of glycine at every third residue is occasionally interrupted. The interrupted sites are speculated to lower the stability of the triple helix and might form kinks in the rod-like triple helix. Thus, a collagen helix that is otherwise unsusceptible to many proteinases might become susceptible at the interrupted sites of the Gly-X-Y repeat.

Thus, type IV collagen may be degraded gradually at the interrupted sites of the Gly-X-Y triplets while being liberated from the aggregates by pepsin treatment.

1.6. Other Amino Acid Residues at Positions X and Y

All side chains at positions X and Y protrude along the surface of the triple helix. Consequently, they contribute to the hydrophilicity, ionization, hydrophobicity (see Fig. 5), and steric roughness of the molecular surface, including the size of the groove of the triple helix (Fig. 2). Charged groups, together with their neighboring sequences, may also affect the stability of the triple helix, presumably because of differential contributions from the water of hydration on the surface.

Figure 5. Hydrophobic residues in collagen polypeptide chains. The content of hydrophobic residues in collagenous domains (indicated as filled circles) is relatively low compared with that of globular proteins. However, they project to the surface of the triple helix, whereas a globular protein keeps most of its hydrophobic residues inside.


The amino acid residues other than proline or hydroxyproline at positions X and Y can provide another classification of the collagen protein family. In general, the collagenous triple-helical domains contain a higher content of basic (arginine + lysine) than of acidic (aspartate + glutamate) residues, resulting in a basic isoelectric point. Two groups of collagens can be distinguished on the basis of their relative contents of arginine and lysine: the high arginine group (Arg/Lys > 1) and the low arginine group (Arg/Lys < 1). A greater content of Lys residues may result in a greater content of hydroxylysine residues, which further provides a possibility for additional glycosylation. A high content of glycosylated hydroxylysine should contribute greatly to the surface roughness of the triple-helical domains. Another general feature of the amino acid composition of collagens is the low content of hydrophobic amino acid residues. The interior of the triple helix is not stabilized by hydrophobic interactions, but such interactions are strong between triple helices because all the hydrophobic residues are exposed on the surface of the triple helix. The content of large hydrophobic residues in the collagenous triple helices also divides the collagenous proteins into two groups. One group contains a high ratio of Ala to large hydrophobic amino acids (Val, Leu, Ile, Phe, and Met); the other group contains a low ratio.
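The composition-based groupings described above reduce to a few residue counts. The following Python sketch shows one way to compute them from a one-letter sequence; the function names and the example sequence are hypothetical, and because hydroxylysine and hydroxyproline arise by posttranslational modification they are counted here as their parent residues (K and P).

```python
from collections import Counter

# Large hydrophobic residues named in the text: Val, Leu, Ile, Phe, Met.
LARGE_HYDROPHOBIC = "VLIFM"


def composition_ratios(sequence: str) -> dict:
    """Counts and ratios used for the composition-based groupings."""
    counts = Counter(sequence.upper())
    large = sum(counts[aa] for aa in LARGE_HYDROPHOBIC)
    return {
        "basic": counts["R"] + counts["K"],    # Arg + Lys
        "acidic": counts["D"] + counts["E"],   # Asp + Glu
        "arg_lys": counts["R"] / counts["K"] if counts["K"] else None,
        "ala_hydrophobic": counts["A"] / large if large else None,
    }


def arginine_group(sequence: str) -> str:
    """Classify as high arginine (Arg/Lys > 1) or low arginine (Arg/Lys < 1)."""
    counts = Counter(sequence.upper())
    if counts["R"] > counts["K"]:
        return "high arginine"
    if counts["R"] < counts["K"]:
        return "low arginine"
    return "undetermined"


if __name__ == "__main__":
    example = "GARGLKGERGPA"  # invented sequence, for illustration only
    print(composition_ratios(example))
    print(arginine_group(example))  # high arginine
```

The text gives no numerical cutoff for the Ala to large hydrophobic ratio, so the sketch reports the ratio without assigning a group.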

2. Classification of Collagens Based on the Primary Sequence

In the fibrillar collagens, the triple-helical conformation occurs throughout 95% of the length of the monomer (Fig. 3). Thus, of the 1057 residues in the α1(I) chain of human collagen, 1014 occur in repeating Gly-X-Y triplets. The N-terminal 17 residues and the C-terminal 26 residues (referred to as telopeptides) do not have glycine as every third residue. The type I collagen helical molecule is a heterotrimer composed of two identical α1(I) chains and one α2(I) chain. The α1(I) and α2(I) chains are very similar, but their primary structures are coded by separate genes and are sufficiently different for the chains to be separated by ion-exchange chromatography and by SDS-PAGE. The α chains each contain over 1000 amino acid residues and have molecular weights of approximately 95,000.
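The residue counts given for the human α1(I) chain can be cross-checked against the helical fraction and the molecular length quoted earlier. This is a small arithmetic sketch in Python with illustrative names; the rise of 0.286 nm per residue is the value given in the section on the shape and structure of the triple-helical domains.

```python
TOTAL_RESIDUES = 1057      # human alpha1(I) chain
HELICAL_RESIDUES = 1014    # residues in repeating Gly-X-Y triplets
N_TELOPEPTIDE = 17         # N-terminal nonhelical residues
C_TELOPEPTIDE = 26         # C-terminal nonhelical residues
RISE_PER_RESIDUE_NM = 0.286


def helical_fraction() -> float:
    """Fraction of the chain that lies in Gly-X-Y triplets."""
    return HELICAL_RESIDUES / TOTAL_RESIDUES


def helix_length_nm() -> float:
    """Estimated length of the triple-helical domain."""
    return HELICAL_RESIDUES * RISE_PER_RESIDUE_NM


if __name__ == "__main__":
    print(N_TELOPEPTIDE + HELICAL_RESIDUES + C_TELOPEPTIDE)  # 1057
    print(round(helical_fraction() * 100, 1))                # 95.9
    print(round(helix_length_nm(), 1))                       # 290.0
```

The estimate of about 290 nm is consistent with the length of about 300 nm reported for type I collagen molecules.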

The existence of a family of collagenous proteins in the connective tissues of vertebrates was first identified when cartilage collagen (type II) was found to be genetically distinct from the type I collagen of skin, bone, and tendon. A third collagen (designated type III) was detected in skin. More than 30 different collagen polypeptides have been found in the extracellular matrix in the form of at least 19 different collagen types (designated from I to XIX; see Table 2).

Table 2. Polypeptide Chain Composition of Genetically Distinct Collagen Types

Type | Known α Chains | Known or Putative Chain Compositions of Molecules at Present | Length of Triple-Helical Domain (nm) | Main or Known Distribution | Aggregate Structure of Purified Protein or that Estimated with Immunohistochemical Study, etc.

Fibrillar Collagen
I | α1(I), α2(I) | [α1(I)]2α2(I); [α1(I)]3? | 300 | Almost all connective tissues without hyaline cartilage | Fibril
II | α1(II) | [α1(II)]3 | 300 | Cartilage | Fibril
III | α1(III) | [α1(III)]3 | 300 | Almost similar to type I | Fibril
V | α1(V), α2(V), α3(V) | [α1(V)]2α2(V); α1(V)α2(V)α3(V); [α1(V)]3? | 300 | Almost similar to type I, adult cartilage | Fibril
XI | α1(XI), α2(XI), α3(XI) | α1(XI)α2(XI)α3(XI) | 300 | Cartilage | Fibril
V/XI | α1(XI), α2(V) | [α1(XI)]2α2(V) | 300 | Vitreous body | Fibril

FACIT (Fibril-Associated Collagen with Interrupted Triple Helices)
IX | α1(IX), α2(IX), α3(IX) | α1(IX)α2(IX)α3(IX) |  | Surface of the cartilage fibril | Aggregated with fibrillar collagen periodically on the cartilage collagen fibrils
XII | α1(XII) | [α1(XII)]3 |  | Tissues rich in type I |
XIV | α1(XIV) | [α1(XIV)]3 |  | Tissues rich in type I |
XVI | α1(XVI) | [α1(XVI)]3 |  |  |

Others
VI | α1(VI), α2(VI), α3(VI) | α1(VI)α2(VI)α3(VI); [α1(VI)]2α2(VI) | 100 | Almost all connective tissues | Beaded microfibril
VII | α1(VII) | [α1(VII)]3? | 420 |  | Anchoring fibril; short dimer
VIII | α1(VIII), α2(VIII) | [α1(VIII)]3; [α1(VIII)]2α2(VIII), [α2(VIII)]3 | 150 | Basement membrane of endothelial cell | Hexagonal array
X | α1(X) | [α1(X)]3 | 130 | Hypertrophic cartilage | Hexagonal array
XIII | α1(XIII) | [α1(XIII)]3? |  |  |
XV | α1(XV) | [α1(XV)]3? | 150 |  |
XVII | α1(XVII) | [α1(XVII)]3 |  |  |
XVIII | α1(XVIII) | [α1(XVIII)]3? |  |  |
XIX | α1(XIX) | ? |  |  | FACIT?

Meshwork-Forming Collagen
IV | α1(IV), α2(IV), α3(IV), α4(IV), α5(IV), α6(IV) | [α1(IV)]2α2(IV); α3(IV)α4(IV)α5(IV), [α3(IV)]2α4(IV); [α5(IV)]2α6(IV)? | 350 | Basement membrane; sinusoid | Polygonal meshwork

The collagen numbering system (with Roman numerals for each collagen type and Arabic numerals for individual α chains) to some extent reflects the relative abundance of the various collagens, in that generally the more abundant collagens were identified earliest. In addition to these collagens, there exist a number of secreted proteins that contain collagenous amino acid sequences and short triple-helical conformations, such as the complement component C1q, acetylcholinesterase, lung surfactant protein, conglutinin, serum mannose-binding protein, scavenger receptors (AR-I and AR-II), and MARCO. The collagenous sequences in these proteins contribute to their distinctive structures and functions. Since they have no known structural role in the extracellular matrix, however, they are not classified as collagens.

From the data derived from amino acid and gene sequencing, collagen molecules can be grouped into the groups shown in Figure 1 and Table 1. Fibrillar collagen molecules are characterized by an uninterrupted helical domain of approximately 300 nm. They are synthesized as procollagens composed of three pro-α chains that undergo processing to α chains and subsequently assemble into collagen fibrils and fibers. Fibrillar collagen molecules (i.e., types I, II, III, V, and XI) exhibit several common structural features that reflect the highly conserved exon-intron structure of the genes.

Polygonal meshwork-forming collagens (type IV collagen polypeptides) have large triple-helical domains (>160 kDa) with a length of >350 nm. Their primary structures are characterized by imperfections in the -Gly-X-Y- triplet sequence. These interruptions are a particular feature of type IV collagen, in which the helical domain contains more than 20 short stretches of non-helix-forming amino acids.

Short triple-helical collagen molecules (types VI, VIII, X) contain interruptions in the helical domain (as in types IX, XII, and XIV). Collagen types VIII and X show remarkable homology and might have similar roles in tissues. Type XII and type XIV collagens have similarities to type IX collagen in their domain structures. A portion of these triple-helical domains has the potential to interact with fibrillar collagen. Thus, these three types of collagen, plus type XVI, comprise a group of fibril-associated collagens with interrupted triple helices (FACIT collagens).

An alternative approach for classifying the collagens depends on supramolecular structures that might be related to their physiological function. Individual collagen types may themselves represent a family of related collagenous structures in the extracellular matrix. Type IV collagen is a family of six homologous α chains (α1, α2, α3, α4, α5, and α6), and type V/XI is a family of six α chains: α1(V), α2(V), α3(V), α1(XI), α2(XI), and α3(XI).
