Lambda Phage (Molecular Biology)

Bacteriophages are viruses that infect and reproduce within bacterial cells. Phage λ has been one of the most intensively studied of all the bacteriophages. It has provided models for biological processes such as DNA replication, transcription, regulation of gene expression, developmental switches, informational suppression, homologous and site-specific recombination, restriction-modification, and morphogenetic pathways, among others, as well as an incomparable window into the biology of its host, Escherichia coli. The genome of λ was among the earliest complete genomes to be sequenced (1). Such detailed knowledge allowed λ to be exploited further as a vector for recombinant DNA experiments, to the point that λ is a standard tool of the contemporary molecular genetics laboratory, as well as a subject of study in its own right in laboratories worldwide.

Phage λ is a member of the temperate group of bacteriophages, which means that an infection may progress along either of two alternative pathways. In the lytic pathway, the host cell is destroyed ("lysed") concomitant with the production and release to the environment of a hundred or so progeny λ. In the lysogenic pathway, the infected bacterium survives in the form of a carrier cell known as a lysogen. This chapter focuses on lytic development of λ (see Lysogeny for the alternative pathway).

1. Initiation of Infection

A particle of λ consists of about half protein and half DNA. The protein comprises an icosahedral head, about 50 nm in diameter, attached to a flexible tubular tail about 150 nm long. The DNA, which is contained within the head, is a linear double-stranded molecule, 48,490 base pairs in length, not counting the complementary single-stranded extensions of 12 nucleotides in length at the 5′ end of each strand. These cohesive, sticky ends (cos) are generated in the course of phage morphogenesis as discussed below. Infection begins with the attachment of λ to the target cell, E. coli K-12. The cellular receptor is an outer membrane protein, LamB, which is also part of the transport system for maltose. λ interacts with LamB through a protein located at the tip of its tail. After phage attachment, the DNA exits the phage head through the tail and enters the bacterial cytoplasm after traversing the outer membrane, periplasm, and inner membrane by an unknown mechanism. Unlike some other phages, λ does not use a contractile tail to assist this process.


Immediately upon DNA entry into the cytoplasm, the single-stranded extensions of the DNA pair with one another, and the DNA is converted to a covalently closed circle through the action of host-cell DNA ligase (Fig. 1). This simple step is essential to the successful completion of lytic development for at least two reasons. First, replication of the λ DNA requires supercoiling, which is only possible in a covalently closed molecule. Second, transcription of approximately half the λ genome (the antiterminated PR′ transcript; see text below) traverses this site of end joining.
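The pairing of the two extensions can be checked directly. The short sketch below takes the two 12-nucleotide 5′ extensions quoted in the Morphogenesis section of this article and confirms that each is the reverse complement of the other, which is what allows them to anneal. The function and constant names are illustrative only, and the circle length is simply the 48,490 base pairs of duplex plus the 12 base pairs formed when the extensions pair.

```python
# 5' single-stranded extensions as quoted in the Morphogenesis section.
RIGHT_END_EXTENSION = "AGGTCGCCGCCC"
LEFT_END_EXTENSION = "GGGCGGCGACCT"

# Duplex length of the virion DNA, not counting the extensions (from the text).
DUPLEX_BP = 48_490

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def can_anneal(strand_a: str, strand_b: str) -> bool:
    """Two single strands pair fully if one is the reverse complement of the other."""
    return reverse_complement(strand_a) == strand_b


# Once the extensions pair and the nicks are sealed, the 12 paired
# nucleotides add 12 bp to the duplex portion of the genome.
CIRCLE_BP = DUPLEX_BP + len(RIGHT_END_EXTENSION)

if __name__ == "__main__":
    print("Extensions complementary:", can_anneal(RIGHT_END_EXTENSION, LEFT_END_EXTENSION))
    print("Circular genome length (bp):", CIRCLE_BP)
```

Because both extensions are 5′ overhangs, annealing leaves one nick in each strand, and it is these two nicks that DNA ligase seals.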

Incidental to the main topic here, circularization is also essential to the lysogenic pathway of development, because a circle is the substrate for integration of the phage DNA into the host chromosome. From this point in the infection cycle, several biochemical activities are carried out on the DNA more or less simultaneously, including transcription, DNA replication, and site-specific recombination (lysogenic development only). For simplicity, transcription and replication are considered separately.

Figure 1. Circularization of the λ genome. (a) The linear genome as found in the virion. The sequences of the single-stranded 5′ extensions are shown. Upon infection, the extensions pair and the nicks are sealed by E. coli DNA ligase to form the circular genome as shown in (b). The positions of genes flanking the joint are shown to facilitate comparison with the transcription map of Figure 2.


2. Transcription

The early transcription program is compatible with either lytic or lysogenic development. Soon, however, each λ infection tilts decisively one way or the other, and the programs thereafter are distinct. It therefore should come as no surprise that λ has deployed timing mechanisms to delay the onset of expression of some genes until the decision for lysis or lysogeny has been made. Such timing prevents the inappropriate expression of, for example, gene products that would lyse a cell that was destined to become a lysogen. The principal feature of the timing is a regulatory cascade in which products of early transcription are required for expression of late genes. These positive activators are antiterminators of transcription, agents of a mechanism of gene control that arguably has attained its most sublime refinement in λ.

The outlines of the lytic transcription program are depicted in Figure 2. The host RNA polymerase, unaided by other factors, recognizes three promoters within λ DNA, designated PL (leftward), PR (rightward), and PR′ (the so-called late promoter). In the absence of phage-encoded antitermination factors, the PL transcript terminates after gene N. About half the PR transcripts terminate after gene cro, the remainder after gene P. The PR′ transcripts terminate almost immediately to produce a short (194 nucleotide, or 6S) RNA that does not encode any protein. Thus the early proteins synthesized by λ are N and Cro and comparatively lesser amounts of cII, O, and P.

The N protein is one of λ's transcription antiterminators. It is specific for transcription beginning at PL and PR. As the level of N increases, new transcription beginning at PL ignores the terminator after gene N and reads genes to the left of N, including cIII, red, xis, and int. Similarly, new transcription beginning at PR ignores the weak terminator after gene cro as well as the strong terminator after gene P, with the result that synthesis of cII, O, and P increases, and a new gene, Q, is transcribed for the first time.

The Q protein is λ's other transcription antiterminator. It is specific for transcription beginning at PR′. As the level of Q increases, new transcription beginning at PR′ ignores its strong terminator and reads genes to the right of Q, including S, R, and, after traversing the cos site as mentioned above, some 20 additional genes encoding structural proteins of the phage particle and enzymes involved in morphogenesis.

Figure 2. Lytic transcription program. The map shows some λ genes (not to scale) and the locations of three promoters used in the lytic program. Note that the template is actually circular (or later, concatemeric), but for convenience it is shown as linearized between genes J and int. Early transcripts are formed by the host RNA polymerase, delayed transcripts require the λ N protein, and late transcripts require the λ Q protein.


In summary then, λ transcripts may be classified as early, delayed, and late. Early transcription results in the synthesis of N protein, which stimulates transcription of the delayed class of genes. The latter includes Q, whose product is required for late gene transcription.
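The early, delayed, and late classes can be derived mechanically from the dependencies described above. The following is a minimal sketch, not a kinetic model: it assumes that a gene's product is available as soon as the gene is transcribed, it ignores the partial read-through at PR that yields the early, lower-level synthesis of cII, O, and P, and it uses the placeholder "head and tail genes" for the roughly 20 late structural and morphogenesis genes. The data structure and function names are illustrative.

```python
# Each entry: (promoter, antiterminator needed for this read-through or None, genes read).
# Gene lists follow the text above; gene and protein names are treated as the same token.
TRANSCRIPTS = [
    ("PL", None, ["N"]),
    ("PR", None, ["cro", "cII", "O", "P"]),
    ("PL", "N", ["cIII", "red", "xis", "int"]),
    ("PR", "N", ["Q"]),
    ("PR'", "Q", ["S", "R", "head and tail genes"]),
]


def expression_waves(transcripts):
    """Return successive waves of newly expressed genes until nothing new appears."""
    expressed = set()
    waves = []
    while True:
        wave = []
        for _promoter, factor, genes in transcripts:
            # A read-through is possible only if its antiterminator was made
            # in an earlier wave (or if none is needed).
            if factor is None or factor in expressed:
                for gene in genes:
                    if gene not in expressed and gene not in wave:
                        wave.append(gene)
        if not wave:
            return waves
        waves.append(wave)
        expressed.update(wave)


if __name__ == "__main__":
    for label, wave in zip(("early", "delayed", "late"), expression_waves(TRANSCRIPTS)):
        print(f"{label}: {', '.join(wave)}")
```

Under these assumptions the loop settles after three waves, with N in the first, Q in the second, and S and R in the third, which mirrors the cascade N → Q → late genes.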

How is this pattern of gene expression altered in infections leading to lysogeny rather than lytic growth? Such infections are characterized by higher levels of cII protein (see Lysogeny). cII is a transcriptional activator specific for three promoters in λ. One of these promoters is the PRE promoter, which is responsible for a burst of lambda repressor synthesis. The repressor turns down expression of early and delayed transcripts by preventing transcription initiation at PL and PR (see Lambda repressor). In addition, cII activates the PaQ (antisense-Q) promoter. As the name implies, this promoter forms a noncoding antisense transcript of gene Q, which delays expression of Q protein and the late genes (2). During infections characterized by high levels of cII, these two effects conspire to delay and ultimately to prevent late-gene expression, exactly as required if the cell is to be successfully lysogenized. (The third promoter controlled by cII, Pint, is also essential for lysogenization, but has no effect on the lytic transcription pattern.)

Just as a commitment to lysogeny entails reduction of early and delayed transcription as described above, curiously, so also does commitment to lytic development. In this case, however, the cII level is low, and the repressor is not made in significant amounts. Instead, the Cro protein predominates. This protein, like the repressor, binds near PL and PR and prevents initiation at those promoters. Because Cro is a product of PR, this is an autoregulatory circuit (see Cro protein).

With two transcriptional antiterminators in its arsenal, how does λ arrange for N to be specific to PL and PR transcripts and Q to be specific to PR′ transcripts? This is best understood for the case of N. The PL and PR transcription units each contain a site called nut (for N-utilization; nutL and nutR, respectively). The nut transcript is a recognition site for N, which binds the nascent RNA at this site and transfers to the transcription complex. N remains with the transcription complex, endowing it with the ability to ignore potential termination sites encountered thereafter (3). Transcription units without nut are indifferent to N; conversely, hybrid transcription units in which nut has been inserted become competent for N-mediated antitermination. Q protein, in contrast, binds to the DNA of the PR′ promoter and interacts with the transcription complex while it is paused at nucleotide 16 or 17 of the transcript (4) (see Antitermination).

3. DNA Replication

For the most part, phage λ uses replication enzymes of the host cell (see DNA Replication). The main thing λ must do is divert these enzymes from their normal responsibilities and direct them to the phage DNA. For this purpose, λ encodes two specific replication proteins, called O and P. They are synthesized early (and also delayed), as described above; thus replication can begin soon after the initiation of infection. The O protein nucleates the assembly of a competent replication complex for λ by binding to the phage origin of replication. The P protein, a functional homologue of the host DnaC protein, binds to the host replicative DNA helicase (DnaB protein) and delivers it to the phage origin complex by interacting with the O protein. DnaB is disassembled from P and is thus activated for its replication function with the help of three host-encoded heat-shock proteins, DnaJ, DnaK, and GrpE (5) (see DnaK/DnaJ Proteins).

DNA replication proceeds in two distinct phases. Initially, the substrate is a covalently closed circle, formed by cos joining in the early moments of infection. A pair of oppositely oriented replication forks set up at the origin form a replication bubble that grows bidirectionally until the entire circle is replicated. This is so-called θ (theta) replication, named after the shape of the partially replicated circle. Later, replication switches to the rolling circle DNA replication mode (Fig. 3). The mechanism of this switch is not understood. In rolling circle replication, a single replication fork moves perpetually around a circle, spinning off a long linear tail, much like paper towels unwinding from a roll. Rolling circle replication is an important preparatory step to phage morphogenesis, because λ normally uses the concatemeric (i.e., multiple genome-length) DNA tail as the substrate for packaging. λ has a further adaptation to facilitate rolling circle replication. Ordinarily, double-stranded DNA ends, such as the end of the rolling circle tail, are unstable in E. coli because they are degraded by the RecBCD nuclease (see Recombination). λ encodes a RecBCD inhibitor (the gam gene product) to avoid this complication.
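The relationship between the circular template and the concatemeric tail can be illustrated with a toy string model. In the sketch below, which is purely schematic, the circle is a short string in which the character "|" stands for the cos site and the letters stand for the rest of the genome; each step of the fork appends one character to the tail. The names and the toy genome are assumptions made for illustration.

```python
from itertools import cycle, islice

COS = "|"                      # stand-in for the cos site
TOY_CIRCLE = COS + "abcdefgh"  # monomeric circular template with a single cos


def rolling_circle_tail(circle: str, fork_steps: int) -> str:
    """Tail produced after the fork has advanced `fork_steps` positions around the circle."""
    return "".join(islice(cycle(circle), fork_steps))


def complete_genomes(tail: str) -> list:
    """Return the cos-to-cos intervals of the tail, i.e. segments bounded by cos on both sides."""
    pieces = tail.split(COS)
    # The first piece precedes the first cos and the last piece follows the
    # last cos, so neither is bounded on both sides.
    return pieces[1:-1]


if __name__ == "__main__":
    tail = rolling_circle_tail(TOY_CIRCLE, 30)
    print(tail)
    print(complete_genomes(tail))
```

Each complete interval is a full copy of the toy genome, and a new one is added for every passage of the fork around the circle, as in Figure 3.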

Figure 3. Rolling circle replication. A perpetual replication fork moving around a circular template that is monomeric, as indicated by the single cos site, produces a concatemeric tail. Each cos-to-cos interval of the tail carries a complete copy of the λ genome and reflects one passage of the replication fork around the circle. The tail is the substrate used for λ packaging.


4. Morphogenesis and Lysis

The result of λ lytic infection is the production, within about 60 minutes, of approximately 100 new phage particles per infected cell. As DNA replication progresses, the replicated DNA can also be used as a template for transcription. Thus there is an accelerating pace of late protein production as infection proceeds, and this production of ingredients stops only when the cell lyses. Indeed, λ mutants blocked in lysis can accumulate as many as 1000 mature phage particles per cell. At some point, everything is ready for phage particle assembly to begin.

In broad outline, the construction of a λ particle involves the separate assembly of tails and empty, somewhat shrunken heads, called proheads. The proheads expand approximately twofold in volume as they are filled with DNA, which is excised from the concatemer by cutting at the cos sites by the phage-encoded terminase protein, in coordination with packaging. The first cut occurs at a randomly chosen cos site on the concatemer. The two ends thus produced are different; they are (i) λ's "right" end (the Rz end of the genetic map) carrying the single-stranded DNA extension 5′ AGGTCGCCGCCC and (ii) its "left" end (the Nu1 end of the genetic map) carrying the single-stranded DNA extension 5′ GGGCGGCGACCT. After cutting, terminase remains bound to the λ left end, which is then brought to the portal of a prohead by interaction between terminase and the prohead. Translocation of the DNA into the prohead occurs with the consumption of an estimated 1 ATP per 2 base pairs packaged. When translocation brings the next cos site to the portal, it too is cleaved by the terminase, which then releases the completed head and remains bound to the λ left end generated at the second cleavage site. The completed head is matured by the addition of a tail, while the DNA end, bound by terminase, may begin filling another prohead.

Two features of packaging are of general interest. First, cos is the only specific sequence required in the DNA to allow it to be packaged, a fact that has been exploited in the construction of artificial plasmid vectors that can be packaged into phage particles in vitro and in vivo (see Cosmid). Also, the distance along the DNA between cos sites determines the amount of DNA per phage, in contrast to phages that use "headful" packaging (see P22 phage). Luckily, λ is somewhat forgiving, readily giving rise to functional particles with as little as 75% or as much as 105% of the normal 48.5 kbp of DNA. This fact makes it possible to replace nonessential λ genes with DNA from other sources without having to make up the amount of replaced DNA precisely (see Cloning).
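The figures quoted above lend themselves to a little arithmetic. The sketch below is a rough calculation, taking the text's 48.5 kbp as the normal genome size, the 75% to 105% tolerance as hard limits, and the estimate of 1 ATP per 2 base pairs at face value; it computes the packageable size window, the approximate ATP cost of filling one head, and the largest foreign insert that fits after a given amount of nonessential DNA is removed. The function names and the 10,000 bp deletion used in the example are hypothetical.

```python
GENOME_BP = 48_500                  # "normal 48.5 kbp" from the text
MIN_PERCENT, MAX_PERCENT = 75, 105  # packaging tolerance from the text
BP_PER_ATP = 2                      # estimated 1 ATP per 2 bp translocated


def packaging_window_bp():
    """Smallest and largest cos-to-cos distance that still yields a functional particle."""
    return GENOME_BP * MIN_PERCENT // 100, GENOME_BP * MAX_PERCENT // 100


def is_packageable(cos_to_cos_bp: int) -> bool:
    low, high = packaging_window_bp()
    return low <= cos_to_cos_bp <= high


def atp_estimate(cos_to_cos_bp: int) -> int:
    """Approximate number of ATP consumed to translocate one cos-to-cos interval."""
    return cos_to_cos_bp // BP_PER_ATP


def max_insert_bp(deleted_bp: int) -> int:
    """Largest foreign insert that fits after removing `deleted_bp` of nonessential DNA."""
    _, high = packaging_window_bp()
    return high - (GENOME_BP - deleted_bp)


if __name__ == "__main__":
    print("Packageable window (bp):", packaging_window_bp())
    print("ATP per wild-type genome:", atp_estimate(GENOME_BP))
    # Hypothetical vector with 10,000 bp of nonessential DNA removed.
    print("Max insert after a 10,000 bp deletion:", max_insert_bp(10_000))
```

With these numbers the window runs from 36,375 to 50,925 bp, so a vector does not need to restore the deleted DNA exactly, only to land somewhere inside that range.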

Phage λ encodes two proteins that together effect the death and lysis of the infected cell. One of these, encoded by gene S, is called "holin," a protein that spans the inner membrane and is believed to form an aqueous channel to the periplasm. The holin channel permits the escape to the periplasm of endolysin, the gene R product, which rapidly digests the peptidoglycan layer, causing lysis. An enduring question about lysis is the timing mechanism that prevents premature lysis. In λ, genes R and S are transcribed from PR′, at the same time as the genes involved in morphogenesis, and the corresponding proteins are made at the same time as other late proteins, yet lysis is delayed until about 45 minutes into infection. It has been suggested that the activity of the S protein may be regulated (6).
