Antitermination Control of Gene Expression (Molecular Biology)

Bacteria have evolved many different complex mechanisms to control both transcription and translation of genes in response to environmental changes. In many cases, transcription is controlled at the level of initiation by DNA-binding proteins that either inhibit (repressors) or stimulate (activators) initiation. In addition, transcription can be regulated at the level of elongation. In some cases, transcription of a gene or operon will terminate prematurely in the absence of the action of a positive regulatory molecule. In these cases, antitermination factors allow transcription to read through termination signals and to generate full-length transcripts.

Two fundamentally different mechanisms for antitermination have been described. In one case, RNA polymerase is modified so as to allow it to read through transcription terminators. This type of mechanism controls phage development (1) and expression of rRNA operons (2). The second mechanism, covered in this chapter, involves trans-acting factors that interact with RNA and prevent formation of the terminator structure. This mechanism is very similar to attenuation, but antitermination can be distinguished from attenuation in that the action of the regulatory molecule results in transcription readthrough, with the default pathway being premature termination. In attenuation, the action of the regulatory molecule induces transcription termination, and the default pathway is readthrough.
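The distinction between the two mechanisms comes down to which outcome is the default and which one the regulatory molecule imposes. The short sketch below restates that rule as a toy Boolean function; the names `Outcome` and `transcription_outcome` are illustrative assumptions, and real systems are of course not all-or-none.

```python
from enum import Enum


class Outcome(Enum):
    TERMINATION = "premature termination"
    READTHROUGH = "readthrough"


def transcription_outcome(mechanism: str, regulator_acting: bool) -> Outcome:
    """Toy rule for the fate of a transcript in the leader region.

    Antitermination: the default is premature termination, and the acting
    regulator causes readthrough.
    Attenuation: the default is readthrough, and the acting regulator
    causes termination.
    """
    if mechanism == "antitermination":
        return Outcome.READTHROUGH if regulator_acting else Outcome.TERMINATION
    if mechanism == "attenuation":
        return Outcome.TERMINATION if regulator_acting else Outcome.READTHROUGH
    raise ValueError(f"unknown mechanism: {mechanism!r}")


for mech in ("antitermination", "attenuation"):
    for acting in (False, True):
        print(mech, acting, transcription_outcome(mech, acting).value)
```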

Three distinct mechanisms that regulate gene expression by antitermination will be reviewed here. These mechanisms differ primarily in the type of biomolecule used as the regulator. The first mechanism uses antiterminator proteins that are activated to bind RNA targets in response to environmental stimuli. In the second mechanism, transfer RNA is used as the regulator. In this case, the degree of aminoacylation (or charging) of the tRNA is used to sense the availability of the cognate amino acid within the cell and to induce expression of genes involved in metabolism of this amino acid. Finally, in the case of the Escherichia coli tryptophanase operon, it appears that ribosomes are used as the regulatory molecule. Thus, bacteria have evolved several ways of using different biomolecules to perform the same task: altering the conformation of the nascent mRNA to signal RNA polymerase whether to terminate prematurely or to continue transcribing the particular structural gene(s).

1. RNA-Binding Protein-Mediated Antitermination: The Sac/Bgl Family of Antiterminator Proteins

Expression of several catabolic operons in bacteria is regulated by antitermination involving RNA-binding proteins. These proteins prevent formation of Rho-independent transcription terminators in the nascent mRNA upstream of the regulated gene(s) (3). One such system in E. coli and several in Bacillus subtilis appear to be highly related, based on similarities of their antiterminator proteins as well as their RNA targets. In addition, several other systems function similarly but appear to have arisen independently. A general model for this mechanism is shown in Figure 1.

Figure 1. A general model for antitermination control by the Sac/Bgl family of antiterminator proteins. Under noninducing conditions, transcription starts at the promoter (designated by the arrow) and terminates prematurely, often in a leader region prior to the structural genes. In the presence of inducer, the antiterminator protein is activated to bind to the RAT (ribonucleic antiterminator) RNA. This binding stabilizes an RNA secondary structure involving the RAT, which prevents formation of the overlapping terminator, and transcription continues into the structural genes.

1.1. The E. coli bgl Operon

The E. coli bglGFB operon encodes all the functions necessary for the regulated uptake and utilization of aromatic β-glucosides. The operon is cryptic in wild-type strains but can become functional through spontaneous mutations. When functional, expression of this operon is regulated by antitermination mediated by the BglG protein in response to the levels of β-glucosides (4). In the absence of inducer, BglG does not bind RNA, and most transcripts terminate at one of two Rho-independent transcription terminators present in the leader region upstream of bglG and between bglG and bglF. When β-glucoside levels are high, BglG binds to an RNA target, named RAT for ribonucleic antiterminator, just upstream of the terminators. This binding stabilizes an alternative antiterminator RNA structure, which prevents formation of the terminator, thus allowing transcription to continue and the operon to be expressed.

The RNA-binding activity of BglG is regulated by phosphorylation mediated by BglF. In the absence of β-glucosides, BglF phosphorylates BglG, which prevents it from dimerizing and binding to the RAT (5). In the presence of β-glucosides, BglF dephosphorylates BglG, which now dimerizes and binds to the RAT. Phosphorylation of both β-glucosides and BglG is accomplished by transfer of the phosphate group from the same phosphorylated residue, Cys24, in BglF (6). These results suggest that, under conditions in which β-glucoside levels are high, the phosphate group can be transferred from BglG back to Cys24 in BglF. A model has been proposed in which unliganded BglF phosphorylates BglG, and β-glucoside binding induces BglF to undergo a conformational change that activates it to dephosphorylate BglG.
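The chain of events from inducer to readthrough in the bgl leader can be written as a small Boolean model. This is only a sketch under simplifying assumptions: each step is treated as all-or-none (in the cell, most rather than all transcripts terminate without inducer), and the names `BglLeaderState` and `bgl_leader_state` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BglLeaderState:
    bglg_phosphorylated: bool
    bglg_dimeric: bool
    rat_bound: bool
    terminator_forms: bool


def bgl_leader_state(beta_glucosides_present: bool) -> BglLeaderState:
    """Toy all-or-none model of BglG/BglF control of the bgl leader."""
    # Without inducer BglF phosphorylates BglG; with inducer it dephosphorylates it.
    bglg_phosphorylated = not beta_glucosides_present
    # Phosphorylated BglG cannot dimerize; only the dimer binds the RAT.
    bglg_dimeric = not bglg_phosphorylated
    rat_bound = bglg_dimeric
    # RAT binding stabilizes the antiterminator, which excludes the terminator.
    terminator_forms = not rat_bound
    return BglLeaderState(bglg_phosphorylated, bglg_dimeric, rat_bound, terminator_forms)


for inducer in (False, True):
    print(inducer, bgl_leader_state(inducer))
```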

A similar system for β-glucoside utilization exists in the related Gram-negative enteric bacterium Erwinia chrysanthemi, although in this case the arb operon is not cryptic (7). ArbG shows high sequence similarity to BglG, suggesting that it functions analogously in antitermination control of the E. chrysanthemi arb genes. Antitermination also appears to control β-glucoside operons in several Gram-positive bacteria. A putative β-glucoside (bgl) operon has been identified in B. subtilis and may be regulated by a similar antitermination mechanism (8). In addition, a protein, BglR, with homology to BglG controls β-glucoside usage in Lactococcus lactis (9).

1.2. The B. subtilis sac Genes

Expression of two sucrose utilization operons in B. subtilis, sacPA (10) and sacB (11), is induced by sucrose via transcription antitermination mediated by the RNA-binding proteins SacT and SacY, respectively. SacT and SacY show extensive sequence similarity to each other, as well as to BglG from E. coli. The antitermination mechanisms that control these genes also appear to be quite similar to that described above for the E. coli bgl operon. Rho-independent transcription terminators exist in leader regions upstream of both sacPA and sacB and prevent transcription of the structural genes in the absence of the inducer, which is sucrose. In the presence of sucrose, SacT and SacY are activated to bind RAT sequences in the sacPA and sacB leader transcripts, respectively, and allow transcription to read through into the structural genes (12). Like BglG in E. coli, both of these antiterminator proteins are phosphorylated. In the case of SacY, phosphorylation negatively regulates RNA-binding activity and appears to be mediated by SacX (13). SacT is phosphorylated by HPr, a component of the phosphoenolpyruvate phosphotransferase system, but the role of this phosphorylation in sucrose-mediated antitermination is less clear (12).

Recently, the protein structure of the RNA-binding domain of SacY has been determined by both NMR (14) and X-ray crystallography (15). The domain exists as a dimer, with each monomer consisting of a four-stranded antiparallel beta-sheet. Several amino acid residues have been identified through genetic, biochemical, and preliminary NMR studies as being important for RNA binding. These residues are clustered on the surface of one side of the protein structure (15).

1.3. Other Examples of Bgl/Sac Type Antiterminators

In addition to the bgl and sac systems described above, several other operons are regulated by RNA-binding antiterminator proteins with homology to BglG, SacY, and SacT. LicT regulates the licS gene, which is involved in β-glucan utilization in B. subtilis (16). There is also a RAT sequence overlapping a potential Rho-independent terminator upstream of licS.

In Lactobacillus casei, the lactose (lac) operon is regulated in response to lactose levels by LacT, which shows sequence homology to the other members of the Bgl/Sac family of antiterminators (17). The 5′-leader region of the lac mRNA contains a region with sequence similarity to the RAT sequence, as well as a potential stem-loop structure resembling a Rho-independent terminator.

1.4. Antiterminators with No Similarity to the Bgl/Sac Family

Several other systems are regulated by RNA-binding antiterminator proteins that are unrelated to those of the Bgl/Sac family; furthermore, these proteins do not appear to be related to each other. These regulatory systems thus appear to have arisen independently.

In B. subtilis, both the glp regulon, which is involved in usage of glycerol-3-phosphate, and a histidine-utilization (hut) operon are regulated by RNA-binding antiterminator proteins: GlpP (18) and HutP (19), respectively. The amino acid sequences of these antiterminator proteins are not similar to those of any other antiterminator proteins. Further, the mechanisms by which these antiterminator proteins function appear to be different from those described above, because there are no clear antiterminator RNA secondary structures near the terminators in these operons.

The amidase (ami) operon of Pseudomonas aeruginosa is regulated by antitermination in response to short-chain aliphatic amides, such as acetamide. The amiR gene encodes an antiterminator protein (AmiR), which is negatively regulated by AmiC, apparently through formation of an AmiC-AmiR complex (20). Acetamide destabilizes the AmiC-AmiR complex, leading to antitermination and expression of the operon. AmiR interacts with an RNA target in the 5′-leader region of the ami mRNA that contains a Rho-independent terminator. However, no clear antiterminator RNA secondary structure is predicted. AmiR binding has been suggested to function in antitermination by interfering directly with formation of the terminator stem-loop structure (20).

In addition to all the catabolic operons described above, one anabolic operon has been shown to be regulated by antitermination. Expression of the nas operon of Klebsiella pneumoniae, which encodes enzymes required for nitrate assimilation in this bacterium, is induced by nitrate or nitrite. The NasR protein mediates transcription antitermination through a terminator in the leader region of the operon (21). This protein shows weak homology with AmiR in the carboxyl-terminal region.

2. Transfer RNA-Mediated Antitermination

An interesting variation on the antitermination mechanism involves the use of tRNA as the regulatory molecule. This mechanism regulates a large number of aminoacyl-tRNA synthetase genes in Gram-positive bacteria (22, 23) and several amino acid biosynthetic operons, including the ilv-leu operon in B. subtilis (24, 25) and the his and trp operons in Lactococcus lactis (26). Expression of these genes is induced specifically by starvation for the corresponding amino acid. In the case of the amino acid biosynthetic operons, insufficient levels of the amino acid lead to increased expression of the corresponding operon. For the aminoacyl-tRNA synthetase genes, increasing the level of the synthetase is thought to allow more efficient charging of the cognate tRNA when the corresponding amino acid pool is low.

A long (approx. 300-nucleotide) untranslated leader region exists upstream of the structural gene(s) of these operons and contains several conserved features, including three stem-loop structures preceding a Rho-independent transcription terminator. Hence, in the absence of the inducing signal, transcription terminates prematurely in the leader region prior to the coding sequences. In addition to the conserved secondary structures, there is an important conserved 14-nucleotide sequence known as the T-box present in each leader region; these genes are therefore known as the T-box family. An alternate arrangement of the leader region involving base-pairing between a portion of the T-box and a conserved sequence on the 5′ side of the terminator stem has been proposed to form an antiterminator structure that allows transcription to read through into the structural genes (Fig. 2) (27).

Figure 2. Model for antitermination control by tRNA. Under conditions with adequate levels of the cognate amino acid (aa), the charged tRNA does not interact with the leader region, and the terminator forms. Under conditions of starvation for the appropriate amino acid, the uncharged tRNA interacts with the leader region via base-pairing between the anticodon and the specifier sequence, and by base-pairing between the CCA sequence at the acceptor end of the tRNA with the side bulge of the antiterminator in the leader. These interactions stabilize formation of the antiterminator conformation of the leader transcript, resulting in induction of expression of the gene. The tRNA is shown as the shaded cloverleaf structure, and a boxed "A.A." attached to the tRNA indicates it is aminoacylated.

Another important conserved feature of the leader region of these genes is the presence of a triplet sequence corresponding to a codon for the appropriate amino acid for each operon. For example, in tyrS, which encodes tyrosyl-tRNA synthetase, the leader contains a UAC tyrosine codon, while the ilv-leu operon leader contains a CUC leucine codon. This triplet is always present in a bulged sequence in Stem-loop I (Fig. 2) and has been shown to be critical for induction in several systems. It was the presence of these triplets that led to the hypothesis that tRNAs play a role in this regulatory mechanism. This triplet was designated the "specifier sequence" because, in the case of the B. subtilis tyrS gene, altering the sequence to correspond to a codon for another amino acid switched induction to respond to starvation for the new amino acid (27). Other experiments demonstrated that translation of this codon was not involved in induction and that uncharged tRNA was the inducer (27). In addition, a second interaction, between the CCA sequence at the 3′ end of the uncharged tRNA and the complementary UGG sequence in the T-box, has been shown to be important (28).
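Both tRNA-leader contacts described above are antiparallel base-pairing interactions, which can be checked with a few lines of code. The sketch below assumes strict Watson-Crick pairing only (no wobble pairs, no modified bases), so the anticodons it prints are simply the exact complements of the specifier codons, not sequences taken from the cited studies; the function names are illustrative.

```python
# Watson-Crick partners for RNA bases (wobble pairing deliberately ignored).
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the antiparallel Watson-Crick partner, written 5' to 3'."""
    return "".join(WATSON_CRICK[base] for base in reversed(rna.upper()))


def can_pair(strand_a: str, strand_b: str) -> bool:
    """True if two 5'-to-3' RNA strings are fully complementary."""
    return reverse_complement(strand_a) == strand_b.upper()


print(reverse_complement("UAC"))  # GUA: partner of the tyrS specifier
print(reverse_complement("CUC"))  # GAG: partner of the ilv-leu specifier
print(can_pair("CCA", "UGG"))     # True: tRNA acceptor end and T-box
```

In this simplified picture, changing the specifier triplet changes which anticodon can pair with it, which is in line with the observation that altering the tyrS specifier switched the response to a different amino acid.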

A model for tRNA-mediated antitermination is presented in Fig. 2. Under starvation conditions for the corresponding amino acid, the cognate uncharged tRNA interacts with two sites in the leader region to induce formation of the antiterminator structure and allow transcription to read through into the coding region. Aminoacylation of this tRNA is predicted to interfere with the interaction at the CCA end and prevent the charged tRNA from binding; the leader transcript then folds into the conformation with the terminator, halting transcription. It is not known whether factors in addition to tRNA are required for antitermination. To date, it has not been possible to reconstitute tRNA-mediated antitermination in vitro, and several other lines of evidence also suggest that other factors may be involved in this mechanism (23).

In addition to the antitermination mechanism described above, processing of the leader RNA has been shown to play a role in regulating expression of the B. subtilis thrS gene (29). Cleavage occurs in the loop of the antiterminator near the T-box sequence and is more efficient under threonine starvation conditions, suggesting that bound tRNA induces both antitermination and RNA processing. This processing increases the stability of the mRNA, which would allow for increased translation and production of the threonyl-tRNA synthetase. Thus induction of expression of this gene in response to threonine starvation occurs at both the level of transcription antitermination and mRNA stability.

3. The E. coli Tryptophanase Operon

E. coli and several other microorganisms have the capacity to degrade tryptophan as a source of carbon, nitrogen, and/or energy (30). The degradative tryptophanase operon (tnaCAB) of E. coli is regulated by catabolite repression (31) and by an antitermination mechanism. Antitermination involves translation of a cis-acting 24-residue leader peptide (tnaC) containing a critical Trp codon (32, 33), one or more RNA polymerase pause sites between tnaC and tnaA (34), and Rho termination factor (34). While the precise antitermination mechanism responsible for controlling the tna operon is not firmly established, all of the data are consistent with the following model (Fig. 3) (35). During growth in a medium lacking both tryptophan and a catabolite-repressing carbon source, transcription initiation is efficient. As transcription proceeds, translation of the leader peptide occurs as soon as the coding sequence becomes available. Once the translating ribosome reaches the UGA stop codon, ribosome release exposes a rut (Rho utilization) site that immediately follows the stop codon. Rho then binds to the rut site and begins to translocate in the 3′-direction, until it encounters paused RNA polymerase, ultimately leading to transcription termination upstream of tnaA. When cells are growing with inducing levels of tryptophan, TnaC, or a complex of TnaC with an unidentified protein, prevents ribosome release at the tnaC stop codon, thereby masking the rut site and, hence, blocking Rho interaction with the transcript. Eventually RNA polymerase would overcome the pause signal and transcribe the structural genes encoding tryptophanase (tnaA) and a tryptophan permease (tnaB). This model assumes that there is a fundamental difference between the TnaC peptide, or the TnaC peptide-protein complex, in cells growing with or without tryptophan. It was proposed that such a complex under inducing conditions would prevent ribosome release (35), reminiscent of characterized translation attenuation mechanisms (36). The tryptophanase operon of Proteus vulgaris is thought to be regulated by a mechanism essentially identical to that of E. coli (37).

Figure 3. Model of E. coli tna operon regulation. Under noninducing conditions (no extracellular tryptophan), ribosome dissociation at the tnaC stop codon exposes a rut site, allowing Rho binding. Rho translocates to the paused RNA polymerase, leading to transcription termination. Under inducing conditions (extracellular tryptophan), ribosome stalling at the tnaC stop codon prevents Rho association, leading to transcription readthrough.
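The sequence of events in the proposed tna model can be laid out as an ordered list for each growth condition. The sketch below is a qualitative restatement of that working model only: it assumes no catabolite repression (so initiation is efficient), treats each step as all-or-none, and uses the hypothetical function name `tna_leader_events`.

```python
from typing import List


def tna_leader_events(tryptophan_inducing: bool) -> List[str]:
    """List the steps of the proposed tna model for one growth condition."""
    events = [
        "transcription initiates at the tna promoter",
        "ribosome translates tnaC and reaches the UGA stop codon",
    ]
    if tryptophan_inducing:
        events += [
            "ribosome release at the tnaC stop codon is prevented",
            "rut site stays masked, so Rho cannot bind",
            "RNA polymerase overcomes the pause",
            "readthrough into tnaA and tnaB",
        ]
    else:
        events += [
            "ribosome is released, exposing the rut site",
            "Rho binds and translocates in the 3' direction",
            "Rho reaches the paused RNA polymerase",
            "termination upstream of tnaA",
        ]
    return events


for inducing in (False, True):
    print("inducing" if inducing else "noninducing")
    for step in tna_leader_events(inducing):
        print("  -", step)
```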