Cuticular Proteins (Insect Molecular Biology) Part 5

Modeling of Cuticular Proteins

CPR protein models

Secondary structure prediction and experimental data summarized above (see sections 5.5.1 and 5.5.2) indicated that β-pleated sheet is most probably the underlying molecular conformation of the members of the CPR family, and that this conformation is most probably involved in β-sheet/chitin-chain interactions of the cuticular proteins with the chitin filaments (Iconomidou et al., 1999, 2001). Can this information be translated into a three-dimensional model?

Unexpectedly, distant sequence similarities of the extended R&R Consensus from several CPR proteins were found with a lipocalin, bovine plasma retinol-binding protein (RBP) (Hamodrakas et al., 2002). Lipocalins are members of a family of extracellular proteins, typically small (160-200 residues), with low sequence similarity among family members (frequently < 20%). They exhibit several common molecular recognition properties, and, while they were classified mainly as transport proteins, it is now clear that they have various functions (Flower, 1996). The lipocalin fold is a highly symmetrical all-β structure dominated by a single eight-stranded antiparallel up-and-down β-sheet barrel (Flower et al., 2000). Fairly recently, it was found that lipocalins are characterized by two hydrophobic "clusters" of residues, the "inner" and the "outer" clusters (Adam et al., 2008).

The first modeling attempt used HCCP12, an RR-1 protein, and led to the construction of a structural model that corresponds to the "extended R&R Consensus" (Hamodrakas et al., 2002). The original model (Figure 4A) comprises the C-terminal 66 residues (out of 89 in total) of HCCP12, and has many advantages since it corresponds to the full sequence of the "extended R&R Consensus" (see section 5.3.2.2). This work was extended to RR-2 proteins, leading to comparable results, as shown for AGCP2b (AgamCPR97) in Figure 4D (Iconomidou et al., 2005).


Low-resolution docking experiments of an extended N-acetylglucosamine tetramer to the model of HCCP12, utilizing the docking program GRAMM (Vakser, 1996), revealed that the proposed model for cuticle proteins accommodates, perpendicularly to the half-barrel β-strands, at least one extended chitin chain (Figure 4A) (Hamodrakas et al., 2002).

Homology modeling results indicate that the basic structural motif of the CPR family is an antiparallel β-sheet structure with a "cleft" full of conserved aromatic residues that form "flat" hydrophobic surfaces on one "face," perfectly positioned to stack against faces of the saccharide rings of chitin. One unpredicted feature in the model is a short two-turn α-helix at the C-terminus of the extended R&R Consensus. This C-terminal part of the model is reminiscent in some respects of the chitin-binding domain of an invertebrate chitin-binding lectin, a two-stranded β-sheet followed by a helical turn (Suetake et al., 2000). More detailed docking experiments (Iconomidou et al., 2005), utilizing GRAMM (Vakser, 1996), showed that chitin chains may run parallel to the β-strands of the half-β-barrel (Figures 4B, 4C). Thus, β-barrels of cuticle proteins may intervene between the long chitin chains in cuticle without disrupting continuity. This parallel arrangement of cuticle protein β-strands with the chitin chains agrees with observations made by Atkins, over 20 years ago (Atkins, 1985), from X-ray diffraction patterns.

The inherent twist of the half-barrel β-sheet of the cuticle proteins and its observed packing arrangement at an angle with the chitin chains may provide a molecular basis for the morphological observation of a helicoidal twist in cuticle. These models were also subjected to analysis (Iconomidou et al., 2005) of the positions of histidine residues, since they might play a role in cuticle sclerotization (Neville, 1975; Andersen et al., 1995; Andersen, 2005) and appear to be very conserved in RR-2 sequences.

The general conclusion from this analysis is that histidines are "exposed," positioned either in turns or at the edges of the half-barrel or its periphery, where they could interact with chitin and in this way take part in cuticle sclerotization (Figure 4D). Alternatively, they could be involved in the variations of the water-binding capacity of cuticle and the interactions of its constituent proteins, because small changes of pH can affect the ionization of their imidazole groups (Andersen et al., 1995).

These observations are in excellent agreement with the predictions made several years ago for the role of histidines from secondary structure predictions (Iconomidou et al., 1999), and strengthen further the value of the models previously proposed for CPR proteins (Hamodrakas et al., 2002).

CPF protein models

The next obvious step involved attempts at elucidating the structural motifs and possible functions of cuticular proteins that belong to families where the "extended R&R Consensus" is absent. An appropriate choice was the CPF family of cuticular proteins (see section 5.3.2.3) (Togawa et al., 2007). This family is of particular interest because its members are expressed just before pupal or adult ecdysis, suggesting that they are most probably components of the outer layer of pupal and adult cuticles; that is, they are likely located in the epi- or exo-cuticle. The epicuticle is one cuticular region that lacks chitin, suggesting that the CPF family of proteins may interact with components of the cuticle other than chitin.

Similarly to CPR proteins, members of the CPF family share significant sequence similarity to the crystallographically solved structure of bovine retinol-binding protein (RBP), which belongs to the class of lipocalins. The models of two proteins, AgamCPF3 from Anopheles gambiae and a CPF homolog, CG8541, from Drosophila melanogaster, were constructed based on this similarity (Papandreou et al., 2010). The derived models (Figures 5A and 5B) indicate that the basic folding motif of CPFs is most probably an antiparallel, up-and-down, β-sheet full-barrel structure, unlike the proposed half-barrel for the CPR family.

The next step involved a high-resolution docking experiment, utilizing GRAMM (Vakser, 1996), of the proposed model with a NAG tetramer (Papandreou et al., 2010). The results (Figure 6B) indicated that the tetramer does not fit into the binding pocket of the CPFs; rather, the CPFs might interact loosely with chitin chains, with their β-strands lying parallel to the chitin chains, in agreement with experimental observations (Atkins, 1985). Further evidence against a role of the CPFs in direct binding to chitin comes from the failure of recombinant CPF proteins to bind to chitin (Togawa et al., 2007). Comparative structural information in the paper by Papandreou et al. (2010) also indicates that carbohydrates should not bind in the pocket. Protein-carbohydrate interactions involve aromatic residues: the cleft of the half-barrel model of HCCP12 contains three critical aromatic residues (Hamodrakas et al., 2002), and the model of HCCP66 has four aromatic residues in its cleft (Iconomidou et al., 2005), whereas comparable residues in the two CPFs, AgamCPF3 and DmelCG8541, are hydrophobic but not aromatic.

Figure 4. Ribbon models of cuticular proteins derived from homology modeling. (A) A ribbon model of cuticle protein structure, displayed using GRASP (Nicholls et al., 1991). The structure of the representative RR-1 cuticle protein HCCP12 was modeled on that of bovine retinol-binding protein (RBP; PDB code 1FEN) (Zanotti et al., 1994), utilizing the program WHAT IF (Vriend, 1990). Further details are in Hamodrakas et al. (2002). The side chains of several aromatic residues are shown and numbered, following the numbering scheme of the unprocessed HCCP12 sequence. The model structure has a "cleft" full of aromatic residues, which form "flat" surfaces of aromatic rings (upper side), ideally suited for cuticle protein-chitin chain interactions, and an outer surface (lower side), which should be important for protein-protein interactions in cuticle. The model is a complex of HCCP12 with an N-acetylglucosamine (NAG) tetramer in an extended conformation. The complex was derived from a "low-resolution" docking experiment of a NAG tetramer, in an extended conformation, with the model of HCCP12, utilizing the docking program GRAMM (Vakser, 1996) and the default parameters of the program. (B) and (C) Two more possible complexes of HCCP12 with a NAG tetramer in an extended conformation, derived from a "high-resolution" docking experiment, utilizing the program GRAMM (Vakser, 1996) and the default parameters of the program for "high resolution." The two models presented in (B) and (C) are the two "top of the list," most favorable complexes, whereas third on the list is a structure similar to that of (A). The one in (B) has the NAG tetramer more or less parallel to the last β-strand of the HCCP12 half β-barrel model, whereas that in (C) has the NAG tetramer more or less parallel to the first β-strand of the HCCP12 half β-barrel model. Note that in both (B) and (C) the chitin chain runs parallel to the β-strands, whereas in (A) the chain is arranged perpendicular to the β-strands. (D) A display of a model of the RR-2 protein AGCP2b. The numbering is that of the unprocessed protein. Histidine (H) side chains are shown as "ball and sticks," in red, with their corresponding numbering.

Figure 5. (A) A ribbon model of the cuticular protein AgamCPF3 structure (green), displayed using the software PyMOL (Delano, 2005). The structure was modeled on that of bovine retinol-binding protein (RBP; PDB code 1FEN) (Zanotti et al., 1994), utilizing the software Modeller v9.2 (Sali and Blundell, 1993). The entire secreted protein, from A1 to W121, is shown in the model. It is complexed with 7(Z),11(Z)-heptacosadiene (7,11-HD), shown in red. The complex was derived from a docking experiment of 7,11-HD with the model of AgamCPF3, utilizing the docking software Autodock4.2 (Morris et al., 2009). The ligand is inside the "pocket" of the β-barrel of AgamCPF3. The ligand was considered as rigid, in its minimum energy conformation. The ligand represents a cluster of 4 out of 10 best solutions (runs). (B) A ribbon model of the CPF protein DmelCG8541 structure (green), constructed and displayed as in Figure 5(A). The model comprises 190 of 257 residues of the secreted protein, from Y43 to S232. It is complexed with 7,11-HD, shown in blue. Details of the docking experiment that produced this complex are as in Figure 5(A). The ligand represents a cluster of 7 out of 10 best solutions (runs). (C) A ribbon model of the cuticular protein AgamCPF3 structure (green), constructed and displayed as in Figure 5(A). The entire secreted protein, from A1 to W121, is shown in the model. The complex was derived from a docking experiment of 7,11-HD with the model of AgamCPF3, utilizing the docking software Autodock4.2 (Morris et al., 2009). Two out of 10 best solutions (runs) for the ligand are shown in red and blue, respectively, inside the "pocket" of the β-barrel of AgamCPF3. The remaining eight solutions also show the ligand to reside inside the "pocket." The 7,11-HD ligand was considered as flexible (all rotatable bonds were set free). (D) A ribbon model of the cuticular protein DmelCG8541 structure (green), constructed and displayed as in Figure 5(B). The model comprises 190 of 257 residues of the secreted protein, from Y43 to S232. It is complexed with 7,11-HD. All other details of the docking experiment that produced the complex are as in Figure 5(A). Two out of 10 best solutions (runs) for the ligand are shown in magenta and blue, respectively, inside the "pocket" of the β-barrel of DmelCG8541. The remaining eight solutions also show the ligand to reside inside the "pocket."

Figure 6. (A) A ribbon model of the cuticular protein AgamCPF3 structure (green), constructed and displayed as in Figure 5(A). The entire secreted protein, from A1 to W121, is shown in the model. The complex was derived from a docking experiment of 7,11-HD (shown in cyan) with the model of AgamCPF3, utilizing the docking software Autodock4.2 (Morris et al., 2009). It shows the ligand, outside the β-barrel of AgamCPF3, in contact with the "hydrophobic outer cluster" (see Table 1 in Papandreou et al., 2010). The side chains of three hydrophobic residues of the conserved "hydrophobic outer cluster," Y2, V83, and Y119, are shown. The ligand was considered as rigid, in its minimum energy conformation. The ligand represents the cluster of the remaining 6 out of 10 best solutions (see Figure 5(A)). (B) A complex of AgamCPF3 (ribbon model shown in green) with a NAG tetramer (ball and stick model) in an extended conformation (taken as a chitin analog). The complex was derived from a "high-resolution" docking experiment, utilizing the docking software GRAMM (Vakser, 1996) and the default parameters of the program, displayed using PyMOL (Delano, 2005). The model presented is the "top of the list," most favorable complex. Note that the "chitin chain" runs parallel to the β-strands of at least half of the β-barrel, in agreement with experimentally derived data (Atkins, 1985). No solution was obtained with the "chitin chain" inside the pocket of the β-barrel. The entire secreted protein, from A1 to W121, is shown in the model.

Therefore, the questions that arise are, what is the functional role of the CPF proteins, and what fits within the cavity of the barrel? One possible function is that they intercalate among the chitin crystallites and chitin-binding proteins of the procuticle. However, this does not explain why they should form a binding pocket. Alternatively, if CPFs are components of the epicuticle, they could perhaps bind, as lipocalins do, to the lipoidal molecules, which are known to act as female contact sex pheromones in certain insect species (Antony and Jallon, 1982; Antony et al., 1985) and are primarily located in the epicuticle (Andersen, 1979). We attempted to dock 7(Z),11(Z)-heptacosadiene (7,11-HD), the predominant female-specific sex pheromone of D. melanogaster (Antony et al., 1985), to the derived models of the D. melanogaster CPF protein, CG8541, utilizing GRAMM (Vakser, 1996) and Autodock4.2 (Morris et al., 2009). The pheromone was considered both as rigid and flexible. Docking results showed that this interaction is possible, indeed energetically favorable, and that 7,11-HD could fit into the binding pocket of the β-barrel or in the outer hydrophobic cluster (Figures 5B, D).

Complex formation between AgamCPF3 and 7,11-HD is also favored, suggesting that a molecule of similar structure could easily bind to AgamCPF3, either inside the pocket (Figures 5A, C) or outside it (Figure 6A); the molecular nature of sex pheromones in An. gambiae, if they exist, remains unknown. Microarray analyses have found significantly different levels of CPF3 transcript in adults of the incipient species M and S, and within the same form following a blood meal or in response to mating (Cassone et al., 2008; Marinotti et al., 2006; Rogers et al., 2008). On the other hand, it is surprising that an epicuticular component would continue to be made and secreted into outer regions of the cuticle days after adult eclosion. An alternative occupant of the CPF binding pocket might simply be intracuticular lipids that are present throughout the cuticle. Several of these cuticular lipids have chemical structures very similar to 7,11-HD (Hadley, 1981). Therefore, they would fit easily into the pocket of the β-barrel of the CPFs, or bind to their "outer hydrophobic cluster" (Figures 5, 6A).

Why do the proposed models correspond to a half-barrel model for CPRs and a full barrel for CPFs? The CPR Consensus region alone (< 70 aa) was used, for that is the region of the protein that matches closely to retinol-binding protein; it is far too short to form a full barrel. By contrast, the CPF match is far longer, and compatible with a full barrel.
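The size argument can be made concrete with the residue counts quoted in the text and figure legends. The following is a minimal sketch only: the dictionary layout and the function name coverage are illustrative choices and are not part of the original modeling work.

```python
# Residue counts quoted in this section: (residues in the model, residues in the protein).
# HCCP12: C-terminal 66 of 89 residues; AgamCPF3: entire secreted protein, A1 to W121;
# DmelCG8541: 190 of 257 residues of the secreted protein (Y43 to S232).
MODELS = {
    "HCCP12 (CPR, half-barrel)": (66, 89),
    "AgamCPF3 (CPF, full barrel)": (121, 121),
    "DmelCG8541 (CPF, full barrel)": (190, 257),
}


def coverage(modelled: int, total: int) -> float:
    """Fraction of the protein that is represented in the homology model."""
    return modelled / total


for name, (modelled, total) in MODELS.items():
    print(f"{name}: {modelled} of {total} residues modelled "
          f"({coverage(modelled, total):.0%})")
```

The CPR model spans far fewer residues than either CPF model, which is the point made above: the region matching retinol-binding protein is simply shorter in the CPR case.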

Fusion Proteins Establish a Role for the Extended R&R Consensus

Predictions of secondary and tertiary structure and experimental evidence supporting them (discussed above in sections 5.1-5.3) established that the extended R&R Consensus has the properties to serve as a chitin-binding motif. In particular, the planar surfaces of the predicted β-sheets will expose aromatic residues positioned for protein-chitin interaction. The ultimate test of these predictions would be to show that the extended consensus region is sufficient to confer chitin binding on a protein.

Rebers and Willis (2001) investigated this possibility by creating fusion proteins using the extended R&R Consensus from the An. gambiae putative cuticular protein, AGCP2b (Dotson et al., 1998; now annotated as AgamCPR97). First, they expressed this protein in E. coli and isolated it from cell lysates. The construct used coded for the complete protein minus the predicted signal peptide, and had a histidine tag added to the N-terminus to facilitate purification. AGCP2b is a protein of 222 amino acids, with an RR-2 type of consensus. The purified protein bound to chitin beads, and could be eluted from these beads with 8 M urea or boiling SDS. This established unequivocally that AGCP2b was a chitin-binding protein. Chitin binding previously had been obtained with mixtures of protein extracted from cuticles of two beetles and D. melanogaster (Hackman, 1955; Fristrom et al., 1978; Hackman and Goldberg, 1978).

The next, and essential, step was to create a fusion protein uniting a protein that did not bind to chitin with the extended R&R Consensus region. Such a fusion was created between glutathione-S-transferase (GST) and 65 amino acids from AGCP2b, covering the region of pfam00379, the extended R&R Consensus:

APANYEFSYSVHDEH TG DIKSQHETRH-GDEVH G Q Y S L L DSD G H QRI V D -YHADHHTGFNA VVRREP

GST and the fusion protein were each affinity purified using a glutathione-sepharose column. GST alone did not bind to chitin but the fusion protein did, requiring denaturing agents for release.

Other experiments defined in more detail the requirements for converting GST into a chitin-binding protein. A shorter fragment of AGCP2b, 40 amino acids (underlined above), with the strict R&R Consensus (shown in italics) did not bind chitin. Nor did the full construct when either the Y and F (highlighted) of the strict R&R Consensus or the T and D (bolded) of the extended consensus were "mutated" to alanine (Rebers and Willis, 2001).
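The fragment quoted above can be checked against the figures given in the text. The sketch below is a rough illustration, not the authors' analysis: it assumes that the spaces in the printed sequence are layout residue and that the hyphens are alignment gaps, and the helper names residues and composition are hypothetical.

```python
from collections import Counter

# Fragment as printed above; spaces are assumed to be layout residue
# and '-' characters are assumed to mark alignment gaps.
FRAGMENT = (
    "APANYEFSYSVHDEH TG DIKSQHETRH-GDEVH G Q Y S L L DSD G H QRI V D "
    "-YHADHHTGFNA VVRREP"
)


def residues(raw: str) -> str:
    """Keep only the amino-acid letters, dropping spaces and gap characters."""
    return "".join(ch for ch in raw if ch.isalpha())


def composition(seq: str) -> dict:
    """Summarize the features discussed in the text for one sequence."""
    counts = Counter(seq)
    return {
        "length": len(seq),
        "aromatic": counts["F"] + counts["W"] + counts["Y"],
        "histidine": counts["H"],
        "cysteine": counts["C"],
    }


summary = composition(residues(FRAGMENT))
print(summary)
```

Under those assumptions the fragment has 65 residues, matching the length stated for the fusion construct, and it contains several aromatic and histidine residues but no cysteine.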

In addition to establishing a function of the extended R&R Consensus, the experiments with "mutant" forms also provided confirmation of key elements in the models discussed in section 5.5.3. Substitution of the two conserved aromatic residues, postulated to be contact points with chitin, abolished chitin binding. With the TD "mutations," alanines were substituted for two other conserved residues. These flank a glycine that is conserved in position in the "extended consensus" of all hard and many soft cuticles (Iconomidou et al., 1999). According to the proposed model (Figure 4A), these two polar residues would point away from the hydrophobic "cleft" and thus should not participate in chitin binding. It should be noted, however, that this glycine is located at a sharp turn, at the end of the second β-strand (in the vicinity of H102 of Figure 4D). The substitution of two polar residues by two alanines may destroy this turn and cause improper folding, thus leading to a structure not capable of binding chitin.

These experiments established, at last, that the extended R&R Consensus is sufficient to confer chitin-binding properties on a protein, and thereby resolved years of speculation on the importance of this region. Since then, comparable experiments have been done with other proteins in the CPR family, both RR-1 and RR-2 forms; all confirm that the extended R&R Consensus can bind chitin (Togawa et al., 2004, 2007; Qin et al., 2009).

Members of Other Cuticular Protein Families Analyzed for Chitin Binding

Data are now available that identify members of other CP families as capable of binding chitin. Most notable was the finding that a recombinant BmorCPT1 bound to chitin (Tang et al., 2010). Given that both CPAP1 and CPAP3 families have ChtBD2 domains (see section 5.3.2.11), it is expected that their members will also bind chitin, but this has only been demonstrated experimentally for a recombinant form of the gasp homolog, a member of the CPAP3 family, from Choristoneura fumiferana (Nisole et al., 2010).

In contrast, using the same methodology, Togawa et al. (2007) failed to demonstrate that either AgamCPF1 or AgamCPF3 could bind to chitin. While the CPR proteins are easily purified after expression in E. coli, the CPF proteins required use of the Pierce Refolding Kit® for proper solubilization. Therefore, their failure to bind could be due to improper refolding, although the information from homology modeling is consistent with a lack of chitin binding (see section 5.5.3.2).

Chitinase, some lectins and proteins from peritrophic membranes all bind chitin (for review, see Shen and Jacobs-Lorena, 1999). What is unique about the extended R&R Consensus and members of the TWDL family is that they lack cysteine residues. These residues serve essential roles in the other types of chitin-binding proteins, forming disulfide bonds that hold the protein in the proper configuration for binding. While these other chitin-binding proteins have weak sequence similarities to one another, they do not approach the sequence conservation seen in the R&R Consensus throughout the arthropods, or the TWDL consensus in the groups in which it is found. Rebers and Willis (2001) suggested that the conservation of the R&R Consensus (shown in Figure 1) could well be due to the need to preserve a precise conformation of the chitin-binding domain in the absence of stabilizing disulfide bonds, and the same reasoning could now be applied to the TWDL sequences where consensus regions are evident (Figure 2A).

Summary of Interaction Studies

Four different types of data have been presented in section 5.5 analyzing the extended R&R Consensus: secondary structure predictions of antiparallel β-sheets (section 5.5.1), experimental spectroscopic evidence from cuticles and cuticle extracts for the predominance of such β-sheets in cuticular protein conformation (section 5.5.2), models showing organization of the consensus into a half β-barrel with a groove that can accommodate chitin (section 5.5.3.1), and direct demonstration that the extended consensus is sufficient to confer chitin binding on a protein (section 5.5.4). These four types of data are all in agreement that the highly conserved amino acid sequence of the extended R&R Consensus forms a novel chitin-binding domain, albeit one that displays an essential feature of other proteins that interact with chitin, namely, the presentation of aromatic residues in a planar surface. Crystal structures of the cuticular protein-chitin complex are needed to confirm that these inferences are correct.

Summary and Future Challenges

This topic has summarized the wealth of information about cuticular proteins amassed since Silvert’s review in 1985. Most striking is that the several hundred-fold increase in sequences for structural cuticular proteins has revealed that the majority have a conserved domain (pfam00379) that is an extended version of the R&R Consensus. We now know that proteins with the R&R Consensus interact with chitin, and we can predict in some detail the features of their sequence that confer this property. We have not yet begun to analyze how the regions outside the Consensus contribute to cuticular properties. There is also direct experimental evidence that a member of the TWDL family binds chitin. However, we have yet to learn how proteins from other families contribute to cuticle structure, or how members of the different families interact with one another and other constituents of the cuticle.

Cuticular protein transcripts are turning up as major indicators of differential gene expression in analyses of insecticide resistance (Vontas et al., 2007; Zhang et al., 2008; Awolola et al., 2009), desiccation resistance (Zhang et al., 2008), resistance to heavy metals (Shaw et al., 2007; Roelofs et al., 2009), response to changing photoperiod (Gallot et al., 2010), and even strain differences and mating (Cassone et al., 2008; Rogers et al., 2008). Obviously, we need to understand how the individual cuticular proteins are contributing in such major ways to such important events.

Cuticular proteins with pfam00379 are one of the largest multigene families found in D. melanogaster (Lespinet et al., 2002), and their numbers are far larger in mosquitoes and Bombyx. We need more information about whether this multiplicity serves to allow rapid synthesis of cuticle, or whether different genes are used to construct cuticles in different regions. If the latter, the question becomes whether subtle differences in sequence are important for different cuticular properties, or if gene multiplication has been exploited to allow precise temporal and spatial control. We also need to learn how two hymenopterans, Apis mellifera and Nasonia vitripennis, manage with far fewer genes for cuticular proteins; is it their protected larval and pupal stages, or something else? The elegant immunolocalization studies that have been carried out were done with antibodies against proteins whose sequences for the most part are unknown. Now that we recognize that several genes may have almost identical sequences, we have to be very careful in designing specific probes for use in Northern analyses, for in situ hybridization, for qRT-PCR, and for immunolocalization, if our goal is to learn the use to which each individual gene is put.

Cuticular protein sequences are certain to be described in ever-increasing numbers as more insect genomes are analyzed. Describers need to be careful to submit to databases an indication of whether assignment as a cuticular protein is based on sequence alone, or on some type of corroborating evidence. It would be helpful if there were a more consistent system for naming cuticular proteins. At the very least, each protein should have a designation of genus and species, and a unique number ideally preceded by the gene family name – e.g., AgamCPR52, DmelTWDL12.

A wealth of sequence information is available already for cuticular proteins, but many challenges lie ahead for those who wish to continue to further our understanding of how the diverse forms and properties of cuticle are constructed extracellularly as these proteins self-assemble in proximity to chitin, and make specific contributions to the properties of the exoskeleton.
