CHAPTER 13
Sequence Alterations in the Carboxyl-Terminal Propeptide Domain
Fransiska Malfait, Sofie Symoens and Anne De Paepe
Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
INTRODUCTION
Type I collagen is synthesized as a soluble precursor molecule, procollagen, which consists of two pro-α1(I) chains and one pro-α2(I) collagen chain, encoded by the COL1A1 (OMIM 120150) and COL1A2 (OMIM 120160) genes, respectively. Each pro-α chain contains a central triple helical domain of more than 1000 residues, consisting of a repeating Gly-Xaa-Yaa sequence, in which Xaa and Yaa are any residue other than cysteine or tryptophan, as well as two large globular extensions, propeptides, at the amino (N-) and carboxyl (C-) termini. Inside the cell, association of the C-propeptide domains initiates assembly of three polypeptide chains, providing a crucial step in the correct nucleation and folding of the procollagen molecules.¹⁻³

The human pro-α1(I) and pro-α2(I) C-propeptides comprise 246 and 247 residues, respectively, which lack the obligate helical Gly-Xaa-Yaa repeat sequence that is characteristic of the triple-helix domain. They contain eight and seven cysteine residues, respectively, which are highly conserved, consistent with their crucial role in the initial stages of procollagen assembly.⁴ Peptide mapping of the CNBr-cleaved C-propeptide domains of type I procollagen has demonstrated the presence of intrachain disulfide bonds between cys 5 and cys 8, and between cys 6 and cys 7.⁵ Although it was assumed that the first four (for the pro-α2(I) C-propeptide the first three) cysteine residues are involved in interchain disulfide bonding,⁶,⁷ it has recently become clear, based on the crystal structure of the pro-α1(III) C-propeptide, that cys 1 and cys 4 are also involved in intrachain disulfide bonding, leaving cys 2 and cys 3 as the only ones involved in interchain disulfide bond formation.⁸

The C-propeptide domains of the fibrillar procollagens are highly conserved (much greater variability is seen in the N-propeptides⁹), and overall sequence homology among C-propeptide domains from different fibrillar collagens is strong (46% identity among human procollagen types I-III).⁸ Despite a high level of similarity, these pro-α chains need to assemble in a procollagen type-specific manner. This is of critical relevance in cells that co-express different procollagens, such as, for example, human skin fibroblasts, which co-express six highly homologous but genetically distinct procollagen chains necessary for the assembly of fibrillar collagen types I, III and V.

Proper chain selection ensures that each type I procollagen molecule has the correct composition of two pro-α1(I) and one pro-α2(I) collagen chain. The ability to discriminate between chains resides within the primary sequence of the C-propeptide and is, at least in part, ascribed to the “chain recognition sequence” (CRS), a highly variable discontinuous sequence of 15 residues, in the approximate center of the C-propeptide (see Figure 13.3).¹⁰ Exposure of this chain recognition sequence is dependent on correct folding of the C-propeptide of each chain into a structure that is stabilized by intrachain disulfide bonds.¹¹ After chain discrimination, three C-propeptides assemble and promote correct registration of the triple helical domain. Presumably, the procollagen chains first associate through a series of non-covalent interactions between the C-propeptide domains to form a trimer, which is stabilized by the formation of interchain disulfide bonds.¹²,¹³ This process is facilitated by interactions with endoplasmic reticulum-resident molecular chaperones, including immunoglobulin heavy-chain binding protein (BiP/GRP78), Serpin H1 and the prolyl 3-hydroxylation complex.¹⁴⁻¹⁷ The triple helix is