Investigating protein-protein interactions in multisubunit proteins: the case of eukaryotic RNA polymerases (Proteomics)

The development of protein purification based on fast and nondisruptive affinity methods rather than on conventional chromatography has provided growing evidence that many proteins exist in vast heteromultimeric associations with molecular weights reaching 1 MDa or above (see Article 99, Large complexes by X-ray methods, Volume 6 and Article 100, Large complexes and molecular machines by electron microscopy, Volume 6). Meanwhile, mass spectrometry (see Article 7, Time-of-flight mass spectrometry, Volume 5), coupled with genome sequence knowledge and with the precise annotation of model organisms such as Saccharomyces cerevisiae (see Article 43, Functional genomics in Saccharomyces cerevisiae, Volume 3, Article 97, Seven years of yeast microarray analysis, Volume 4, and Article 39, The yeast interactome, Volume 5), has considerably accelerated the identification of polypeptides belonging to such complexes. Thus, the Tandem Affinity Purification method (Rigaut et al., 1999), in which two different affinity tags (a calmodulin-binding peptide and Staphylococcus aureus protein A) allow the fast purification of multiprotein complexes at near-physiological ionic strengths, preserves structures that are typically disrupted by conventional purification procedures. When applied to the yeast proteome (Gavin et al., 2002; Ho et al., 2002), this approach suggests that the eukaryotic cell may harbor several hundred of these complexes (see Article 35, Structural biology of protein complexes, Volume 5).


Some of these structures have a remarkably stable subunit composition, like the mRNA polyadenylation complex, where tandem affinity purification from different polypeptides consistently yielded the same set of 20 different subunits. Others have a much more dynamic organization. Consider, for example, a molecule of DNA-dependent RNA polymerase (Pol) II that successively finds a promoter, passes through an early phase of “abortive” transcription (probably without moving along its DNA), then moves along its template to synthesize RNA in a processive way, and finally terminates transcription before it possibly recycles onto the same promoter. During these successive stages of the transcription cycle, the same molecule is associated with a nearly 1-MDa Mediator, with the General Transcription Factors TFIIB, TFIIE, TFIIF, and TFIIH, with more or less well-defined elongation factors such as TFIIS, Spt4, Spt5, and some others, with the capping enzyme, and perhaps with the polyadenylation machine mentioned above (Woychik and Hampsey, 2002). The cumulated mass of this ensemble of more than 70 polypeptides would be close to the size of ribosomes (4.2 MDa in eukaryotes), but it is rather unlikely that a huge “transcriptosome” of that size exists as a stable structure. Indeed, when yeast Pol II subunits were purified by tandem affinity purification, they consistently yielded a 19-polypeptide complex of about 0.75 MDa comprising the 12-subunit Pol II, the 3-subunit General Transcription Factor TFIIF, two elongation factors (Spt4 and Spt5), and the mRNA cap methylase. This complex probably corresponds to the elongating form of RNA polymerase II. Factors such as TFIIB, TFIIE, TFIIS, and TFIIH did not copurify with Pol II in tandem affinity purification, although there is good evidence that they directly bind Pol II. There was also no trace of the Mediator, although the latter can stably bind Pol II in vitro, thus forming edifices of about 1.5 MDa that can be visualized by electron microscopy.

Knowing the structure of such huge and more or less transient complexes at an atomic level of resolution is undoubtedly one of the most important challenges of modern protein chemistry (see Article 99, Large complexes by X-ray methods, Volume 6 and Article 100, Large complexes and molecular machines by electron microscopy, Volume 6). Short of unforeseen technical breakthroughs, however, this is unlikely to be done rapidly. In the very well studied case of the yeast 12-subunit Pol II (Cramer et al., 2000; Cramer et al., 2001; Armache, 2003; Bushnell, 2003), it took more than 15 years to obtain the (nearly) complete atomic structure, starting from electron microscopic studies that provided the first low-resolution images of RNA polymerases I and II (Edwards et al., 1990; Schultz et al., 1990). These images paved the way for a very recent atomic reconstruction of Pol II associated with the monomeric initiation factor TFIIB and elongation factor TFIIS. In most cases, however, cell biologists still largely rely on generic methods for investigating protein complexes in terms of protein-protein interactions and for obtaining at least a broad picture of their general organization. These methods will never replace the precise knowledge brought by crystallographic structures, but they are extremely important for obtaining an overall view of the subunit organization of such complexes and of their modular structure.

Protein-protein interaction mapping is based on the idea that individual polypeptides have autonomous folds and that their interactions can therefore be reconstituted by presenting two interacting partners to each other under physiological conditions. The aim, therefore, is to detect the formation of stable heterodimers by suitable biochemical or genetic tests. The biochemical approach is based on protein “pull-down” assays, where one partner polypeptide is fused to an affinity tag (e.g., glutathione-S-transferase or polyhistidine) and bound to a resin (e.g., glutathione Sepharose or Ni-NTA agarose), the putative partner protein being chromatographed through the affinity resin. The retention of a given partner is detected by antibodies raised directly against that partner or against suitable epitopes fused to it.

The setup can vary widely. In one form, the binding test is done between two purified proteins. This method is very sensitive, but its specificity is questionable, since the concentrations of the purified proteins can be far in excess of those found in the cell. Indeed, the tendency of proteins to bind to each other through fairly unspecific interactions is a major source of false positives. Testing interactions between proteins expressed in whole-cell extracts, typically from Escherichia coli or from baculovirus-infected insect cells, has the advantage that the protein used as “bait” is challenged for nonspecific interactions by the whole-cell extract itself. False-positive interactions are therefore likely to be less frequent, as long as the two partners are not massively overproduced in the host cell.

In the genetic two-hybrid test (Fields and Song, 1989), one polypeptide is typically fused to the Gal4 DNA-binding domain (GBD) and tested against putative partners fused to the Gal4 activation domain (GAD). If the two fusions are able to interact, they form an active heterodimeric Gal4 activator, inducing the transcription of suitable yeast reporter genes in vivo. An interesting development of the two-hybrid method is readily applicable to organisms with compact genomes such as yeast, where it is possible to screen a given bait protein for its ability to interact with a library of small genomic fragments (around 700 bp; Fromont-Racine et al., 1997). Since intergenic distances are short and introns are rare in S. cerevisiae, this essentially amounts to testing a random library of coding sequence fragments. In a typical experiment, 10⁷ clones can be tested, usually yielding several dozen interactants. This approach provides an excellent indicator of specificity, as distinct but overlapping fragments corresponding to the same partner should be isolated in the same screen. Comparing the limits of these fragments then allows the interacting domain to be delineated rather precisely. This approach has been used to investigate the organization of complex edifices such as the yeast 17-subunit Pol III (Flores et al., 1999).
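The delineation step can be illustrated with a minimal sketch: if several overlapping prey fragments of the same partner are recovered, the region common to all of them is the smallest segment that could carry the interacting domain. The fragment coordinates and the function name below are hypothetical and serve only as an illustration; they are not taken from any published screen.

```python
from typing import List, Optional, Tuple

# A fragment is given as (start, end) residue positions, inclusive.
Fragment = Tuple[int, int]

def minimal_common_region(fragments: List[Fragment]) -> Optional[Fragment]:
    """Return the region shared by all fragments, or None if there is none."""
    if not fragments:
        return None
    # The shared region starts at the latest start and ends at the earliest end.
    start = max(s for s, _ in fragments)
    end = min(e for _, e in fragments)
    return (start, end) if start <= end else None

# Hypothetical prey fragments of one partner isolated in the same screen.
prey_fragments = [(120, 410), (180, 450), (150, 390)]
print(minimal_common_region(prey_fragments))  # (180, 390)
```

The more independent overlapping fragments are recovered, the narrower this common region tends to become, which is why fragment libraries also help to judge specificity.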

The atomic structure of Pol II, where the 12 subunits are connected by 16 pairwise interactions, offers a rare opportunity for comparing the predictive power of these two approaches and leaves little doubt as to the superiority of the two-hybrid approach, especially when the latter is performed with a library of fragments. GST pull-down assays have been performed with the 12 subunits of the human Pol II, and were also extended to the fission yeast (Kimura and Ishihama, 2000). These results can be readily compared to the yeast structural data, since there is considerable homology between the two RNA polymerases and since several of the human subunits can functionally replace their yeast counterparts in vivo. In the human enzyme, the method correctly detected eight interactions, missed eight others, and gave seven false positives (Edwards et al., 2002; Figure 1). False positives were therefore a major problem, especially with some small subunits that were (falsely) predicted to interact with many partners. For example, the Rpb5 subunit interacted with five partners, including Rpb1, in the pull-down experiments, whereas it binds only Rpb1 in the crystal structure. The S. pombe Rpb3 subunit was also seen to interact with six partners, of which only two (Rpb2 and Rpb11) actually contact it in the structure.

The analysis of yeast RNA polymerase III (Flores et al., 1999) suggests that the two-hybrid approach may be much less prone to false positives. RNA polymerase III is a very complex enzyme that contains no fewer than 17 subunits (Siaut et al., 2003). A core of 12 subunits is homologous (7) or even identical (5) to the subunits of the 12-subunit crystal structure of Pol II (Armache, 2003; Bushnell, 2003). Nine of the 16 interactions predicted by the Pol II structure were found in the two-hybrid analysis of Pol III, with only one false positive (Rpb2-Rpb11). Moreover, the domains of interaction predicted by this analysis matched the ones found in the Pol II structure. The five Pol III-specific subunits were also included in this two-hybrid screening, which allocated them to two distinct groups, one formed by the Rpc53 and Rpc37 subunits and one formed by Rpc31, Rpc34, and Rpc82. The case of Rpc37 was quite striking, since this polypeptide was first identified as a specific partner of Rpc53 before being proven to be a bona fide subunit of Pol III (Flores et al., 1999). The other three subunits interact with each other, and are also connected to the 12-subunit core (via an Rpc31-Rpc17 interaction) and to the TFIIIB initiation factor (via Rpc34-Brf1 and Rpc17-Brf1 interactions). Brf1 is a conserved polypeptide that has close homology to the TFIIB initiation factor of Pol II, and is one of the polypeptides forming the Pol III initiation factor TFIIIB. The physiological relevance of these interactions was supported by genetic studies in which amino acid replacements were selected to impair the two-hybrid interaction and were then shown to result in transcription defects (Andrau et al., 1999; Brun et al., 1997; Ferri et al., 2000) or dissociation of the corresponding subcomplexes (Werner et al., 2002). Moreover, the effect of such mutations was in some cases corrected by the overexpression of other components of the same complex (Briand et al., 2001).
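The counts quoted above for the two mapping approaches can be put on a common footing with a short calculation. The sketch below treats the 16 pairwise contacts of the Pol II crystal structure as the reference set and computes, for each method, the fraction of true contacts recovered (sensitivity) and the fraction of reported contacts that are real (precision). The function name is illustrative only, and the comparison assumes that the Pol III core contacts can be scored against the Pol II structure, as the text does.

```python
def summarize(true_pos: int, false_neg: int, false_pos: int):
    """Return (sensitivity, precision) from raw interaction counts."""
    sensitivity = true_pos / (true_pos + false_neg)
    precision = true_pos / (true_pos + false_pos)
    return sensitivity, precision

# (detected, missed, false positives), taken from the counts given in the text.
counts = {
    "GST pull-down, human Pol II": (8, 8, 7),
    "two-hybrid, yeast Pol III core": (9, 16 - 9, 1),
}

for method, (tp, fn, fp) in counts.items():
    sens, prec = summarize(tp, fn, fp)
    print(f"{method}: sensitivity {sens:.2f}, precision {prec:.2f}")
```

On these figures the two methods recover a similar share of the true contacts (8 and 9 out of 16), whereas the proportion of reported contacts that are genuine differs markedly (8 out of 15 versus 9 out of 10), which is the point made in the text about false positives.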

Figure 1 Organization of the subunits in yeast and human RNA polymerases. (a) Projection of the RNA polymerase II structure (S. cerevisiae). The location of the subunits is indicated within the contour of the model (according to Cramer et al., 2000, 2001). Rpb1 and Rpb2 were omitted for clarity. Color code: Rpb3: red; Rpb4: dark green; Rpb5: magenta; Rpb6: blue; Rpb7: violet; Rpb8: light green; Rpb9: orange; Rpb10: black; Rpb11: yellow; Rpb12: gray. (b) Protein interactions in RNA polymerase II, according to the crystallographic analysis (Cramer et al., 2000, 2001). The connections between the Pol II subunits are indicated by the black lines. The color code is the same as in (a). (c) Protein interactions in human RNA polymerase II, according to the GST pull-down experiments (Acker et al., 1997). The red lines represent false-positive interactions. Dashed lines correspond to interactions that remained undetected. (d) Protein interactions in yeast RNA polymerase III, according to the two-hybrid assay (Flores et al., 1999). The code is as in (c).

Most cell biologists would agree that the association of proteins with each other has been a major playground of evolution, improving their biological performance in the extremely crowded environment of living cells. In the eukaryotic cell, one could therefore view the nucleoplasm, the cytosol, and also the matrix compartment of organelles as highly organized communities of multiprotein complexes, rather than as crowded swimming pools for individualistic proteins. However, a long time will pass before structural data provide a general view of how most of these complexes are organized in terms of subunit interactions. In the meantime, there is ample room for generic methods such as the two-hybrid screening approach discussed above and even the somewhat less reliable protein pull-down approach. If carefully applied, these methods can provide important insights into the organization of multiprotein complexes, especially when supported by the in vivo or in vitro analysis of yeast mutants defective in the corresponding protein-protein interactions.
