DNA REPLICATION AND ITS REGULATION (Nucleic Acid Synthesis)

A. DNA Replication

DNA replication is initiated at discrete sequences called origins (ori) of replication, to which DNA polymerase and accessory proteins bind and copy both strands, as predicted by the semi-conservative replication model (Fig. 2B). In contrast to unidirectional RNA synthesis, DNA replication in most genomes occurs bidirectionally (Fig. 2B). This results in both continuous and discontinuous synthesis of the same strand on two sides of the origin of replication. Some circular genomes, such as mitochondrial DNA, are replicated unidirectionally. In these cases, replication starting at the ori proceeds continuously in the 5′ → 3′ direction, followed by discontinuous synthesis of the complementary strand. Termination occurs at the same site as the ori after the circle is completely traversed. During replication of the mitochondrial genome, elongation of the continuous strand pauses at some distance from the ori, resulting in a bubble (θ structure) named a D-(displacement) loop (Fig. 4A).

The single-stranded DNA genomes of certain small E. coli viruses (such as M13 and φX174) are replicated in the form of rolling circles in which unidirectional synthesis of one (virus genome) strand occurs by continuous displacement from the template (complementary strand; Fig. 4A). The initial duplex DNA (called the replicative form or RF) is the template for rolling circle synthesis and is formed first by replication of the single-stranded form. Such a single-stranded circular DNA template has been exploited in recombinant DNA techniques.


Small organisms (e.g., bacteria), as well as plasmids and many viruses, have only one ori sequence per cellular genome (4.7 × 10⁶ nucleotide pairs in E. coli), which is often an uninterrupted DNA molecule (Figs. 4A and 4B). In complex organisms, the genome is much larger (~3 × 10⁹ nucleotide pairs for mammals) and is divided into multiple discrete chromosomes. Thousands of ori sequences are present (Fig. 4C), although not all of them may be active in all cells; this requires that replication be regulated and coordinated.

B. Regulation of DNA Replication

Semi-conservative replication of the genome ensures that each daughter cell receives a full complement of the genome prior to cell division. In eukaryotes, this is achieved by the distinct phases of the cell cycle, namely, G1 phase, during which cells prepare for DNA synthesis; S phase, in which DNA replication is carried out; and G2-M (mitosis), during which the replicated chromosomes segregate into the two newly divided daughter cells. Unlike in eukaryotes, DNA replication in prokaryotes may occur continuously during growth (in rich medium). Thus, the copy number of genomes could exceed two in rapidly growing cells. In the case of viruses, which multiply by utilizing the host cell's synthetic machinery and eventually kill the host, genome replication may not be controlled. However, plasmid DNA, as well as the genomes of organelles such as mitochondria and chloroplasts, is replicated with some degree of regulation. In these cases the genomic copy number can vary within limits as a function of growth conditions.

C. Regulation of Bacterial DNA Replication at the Level of Initiation

In all organisms, as well as in autonomously replicating DNA molecules of organelles and plasmids, replication is divided into three stages: initiation, chain elongation, and termination. The control of replication occurs primarily at the level of initiation of DNA synthesis at the "origin" (ori site). Because DNA chains cannot be started de novo and require a primer, the initiation complex contains primase activity for synthesis of an RNA primer. Discontinuous synthesis of Okazaki fragments needs repeated primer synthesis for each fragment as an integral component of chain elongation. Initiation of the primer at the ori sequence, rather than elongation of initiated chains, is the critical event in DNA replication control.

Different replicons of prokaryotes and eukaryotes utilize distinct mechanisms which vary in complexity, depending on the complexity of the organisms. A common feature of replication initiation control in E. coli genomes and plasmids is the presence of repeats of A·T-rich sequences, which facilitate unwinding of DNA, and one or multiple repeats of a "dnaA box" to which the initiator DnaA protein in E. coli, or its functional homolog (called Rep in other cases), binds to allow helical unwinding and primer synthesis. The level of DnaA protein regulates the initiation frequency and, in turn, is controlled at the level of transcription of the dnaA gene. DnaA regulates its own gene through complex negative autofeedback loops, and its steady-state level in the cell is determined by the cellular growth state. The frequency of replicon firing is dependent on the growth rate of the bacteria. As mentioned before, rapidly growing cells can have multiple copies of the genome, while cells with a very low growth rate have only one copy. Furthermore, as expected in cells with multiple genome copies, the genes near the origin will have a higher average copy number than the genes located near the terminus of replication and, therefore, will be more transcriptionally active.
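The origin-to-terminus dosage difference described above can be put in rough numbers. The sketch below assumes the classical Cooper-Helmstetter description of steady exponential growth, which is not stated in this article: if one round of replication takes a fixed time C and the culture doubles every τ minutes, the average copy number of an origin-proximal gene exceeds that of a terminus-proximal gene by a factor of 2^(C/τ). The function name and the doubling times are illustrative choices; the 40-min value for C is the replication time quoted later in the article.

```python
def ori_to_ter_ratio(c_period_min: float, doubling_time_min: float) -> float:
    """Average origin-to-terminus gene dosage ratio, 2**(C / tau).

    Assumes steady exponential growth and a constant replication time C
    (Cooper-Helmstetter model; an assumption, not taken from the article).
    """
    return 2 ** (c_period_min / doubling_time_min)


C_PERIOD_MIN = 40.0  # one round of E. coli replication, in minutes

for tau in (20.0, 40.0, 120.0):  # illustrative doubling times in minutes
    ratio = ori_to_ter_ratio(C_PERIOD_MIN, tau)
    print(f"doubling time {tau:>5.0f} min -> ori/ter dosage ratio {ratio:.2f}")
```

Under these assumptions a 20-min doubling time gives a fourfold excess of origin-proximal genes, and the ratio approaches 1 as growth slows, in line with the single genome copy expected in very slowly growing cells.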

In the case of multicopy plasmids, the control of copy number is mediated by the synthesis of anti-sense RNA of the replication initiator protein Rep, which is copied from the nontranscribed DNA strand and is thus complementary to the normal RNA. Anti-sense RNA prevents synthesis of the Rep protein, which is required for initiation of DNA synthesis and whose concentration is the primary determinant of initiation frequency. Rep proteins encoded by plasmids bind to additional copies of binding sites called "iterons," often present upstream of the ori sequences in the plasmids.

D. DNA Chain Elongation and Termination in Prokaryotes

Once initiated, DNA replication proceeds by coordinated copying of both leading and lagging strands. Although both bacteria and eukaryotes have multiple DNA polymerases, only one, named polymerase III (Pol III), is primarily responsible for replicative DNA synthesis in E. coli. In eukaryotes, DNA polymerases δ and ε have both been implicated in this process along with a suggestion that each of these two enzymes may be specific for leading or lagging strand synthesis.

Replication involves separation of the two DNA strands, which is catalyzed by DNA helicases that hydrolyze ATP during this reaction. ATP hydrolysis provides the energy needed for the unwinding process. All cells have multiple DNA helicases for a variety of DNA transactions.

DnaB is the key helicase for replication of the E. coli genome. However, other helicases such as Rep and PriA are also involved in replication and interact with other components of the replication complex called the replisome.

Replication requires a large number of proteins, including the holoenzyme of Pol III which includes, in addition to the catalytic polymerase cores, ten or more pairs of other subunits. The polymerase complex appears to have a dimeric asymmetric structure in order to replicate simultaneously two strands with opposite polarity. The continuous leading strand synthesis should be processive without interruption, because periodic RNA primer synthesis is not necessary once the leading DNA strand synthesis is initiated. On the other hand, the discontinuous lagging strand synthesis should not be processive, because repeated synthesis of RNA primers is required to initiate synthesis of each Okazaki fragment. The Pol III holoenzyme appears to assemble in a stepwise fashion, with its key β-subunit dimer acting as a sliding clamp; its X-ray crystallographic structure shows a ring surrounding the DNA. This clamp is loaded on DNA by the γ-complex, accompanied by ATP hydrolysis. The dimeric structure of the replication complex is maintained by the dimeric subunit of the holoenzyme. The β-clamp slides on the duplex DNA template and thus promotes processivity. Proliferating cell nuclear antigen (PCNA) is the sliding clamp homolog in eukaryotic cells and is also used in SV40 replication.

Much of the information about the composition of the E. coli Pol III holoenzyme, and DNA chain elongation, was generated from studies of the replication of small, single-stranded circular DNAs of bacterial viruses φX174 and M13 and also of laboratory-constructed plasmid DNA containing the ori (oriC) of E. coli. Asymmetric dimeric replication complexes have also been identified for larger E. coli viruses such as T4 with a linear genome and for the mammalian SV40 virus with a double-strand circular genome. In circular genomes, DNA synthesis is terminated at around 180° from the origin. In the case of linear genomes, termination occurs halfway between two neighboring replicons. The mechanism of termination is not completely understood. In the E. coli genome, specific termination (ter) sequences are present which bind terminator proteins; such proteins act as anti-helicases to prevent strand separation. However, the termination may not be precise and occurs when the replicating forks collide.

E. General Features of Eukaryotic DNA Replication

Unlike the genomes in bacteria and plasmids (as well as in mitochondria and chloroplasts), which consist of a circular duplex DNA with a single ori sequence, the genomes of eukaryotes are not only much larger and linear, but also contain multiple ori sequences for DNA replication and thus multiple replicons. Thousands of replicons are simultaneously fired in mammalian genomes, as is needed to complete replication of the genome in a few hours. Mammalian genomes are three orders of magnitude larger than the E. coli genome, for which one round of replication requires about 40 min at 37°C. Replication of a mammalian genome, initiated at a single ori, would thus take more than 1 week with the same rate of synthesis. In fact, it would be even longer because the rate of DNA chain elongation is slower in eukaryotes than in E. coli, possibly because of the increased complexity of eukaryotic chromatin.
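The single-origin estimate can be checked with a few lines of arithmetic. The sketch below uses only figures quoted in this article (4.7 × 10⁶ nucleotide pairs and about 40 min per round for E. coli, roughly 3 × 10⁹ nucleotide pairs for a mammalian genome); the function and constant names are illustrative, and the calculation assumes the overall E. coli rate applies unchanged.

```python
ECOLI_GENOME_BP = 4.7e6    # nucleotide pairs in the E. coli genome
MAMMAL_GENOME_BP = 3e9     # approximate nucleotide pairs in a mammalian genome
ECOLI_ROUND_MIN = 40.0     # minutes per round of E. coli replication at 37 °C


def single_origin_replication_days(genome_bp: float) -> float:
    """Days needed to copy genome_bp from one origin at the E. coli rate."""
    # Overall rate for one bidirectional origin (both forks combined).
    rate_bp_per_min = ECOLI_GENOME_BP / ECOLI_ROUND_MIN
    minutes = genome_bp / rate_bp_per_min
    return minutes / (60 * 24)


days = single_origin_replication_days(MAMMAL_GENOME_BP)
print(f"single-origin replication time: about {days:.1f} days")
```

With these inputs the estimate comes out at roughly 18 days, comfortably above the "more than 1 week" stated in the text, and it would grow further with the slower eukaryotic elongation rate.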

As mentioned earlier, DNA replication in eukaryotes occurs only during the S phase, which can last for several hours but whose duration varies with the organism, the cell type, and also the developmental stage. For example, in a rapidly growing early embryo of the fruit fly D. melanogaster, cellular multiplication with duplication of the complete genome occurs in less than 15 min. The details of temporal regulation of firing of different replicons are not known. However, euchromatin regions are replicated earlier than the heterochromatin regions.

The details of initiation of replication at individual replicons have not been elucidated in eukaryotes. Some ori sequences of the yeast genome, known as autonomous replication sequences (ARS), have been determined. Although such sequences in the mammalian genomes have not been isolated, the ori regions of certain genes which could be selectively amplified have been localized by two-dimensional electrophoretic separation. Nevertheless, a significant amount of information has been gathered regarding regulation of DNA replication at the global level.

F. Licensing of Eukaryotic Genome Replication

Unlike in bacteria and plasmids, DNA replication in eukaryotic cells is extremely precise, and replication initiation occurs only once in each cell cycle to ensure genomic stability. "Licensing" is the process of making the chromatin competent for DNA replication, in which a collection of proteins called the origin recognition complex (ORC) binds to the ori sequences. This binding is necessary for other proteins required for the onset of the S phase to bind to DNA. ORC is present throughout the cell cycle. However, other proteins required for replication initiation and chain elongation are loaded in a stepwise fashion. The onset of the S phase may be controlled by a minichromosome maintenance (MCM) complex of proteins which licenses DNA for replication, presumably by making it accessible to the DNA synthesis machinery. Several protein factors are involved in the loading process, which is regulated both positively and negatively. The level of regulator proteins, such as geminin, which blocks licensing, is also regulated by some cell cycle-dependent feedback mechanisms.

G. Fidelity of DNA Replication

The maintenance of genomic integrity in the form of the organism-specific nucleotide sequence of the genome is essential for preservation of the species during propagation. This requires an extremely high fidelity of DNA replication. Errors in RNA synthesis may be tolerated at a significantly higher level because RNAs have a limited half-life, even in nondividing cells, and are redundant. In contrast, any error in DNA sequence is perpetuated in the future, as there are only one or two copies of the genome per cell under most circumstances. Obviously, all organisms have a finite rate of mutation, which may be necessary for evolution. Genetic errors are one likely cause of such mutations. Inactivation of a vital protein function by mutation of its coding sequence will cause cell death. However, mutations that affect nonessential functions could be tolerated. Some of these mutations can still lead to a change in the phenotype, which in extreme cases can cause pathological effects. In other cases, these may be responsible for susceptibility to diseases. In many cases, however, such mutations appear to be innocuous and are defined as an allelic polymorphism. The mammalian genome appears to have polymorphism in one out of several hundred base pairs. Such mutations obviously arose during evolution and subsequent species propagation.

The error rate in replication of the mammalian genome is about 10⁻⁶ to 10⁻⁷ per incorporated deoxynucleotide. The catalytic units of the replication machinery, namely, DNA polymerases, have a significantly higher error rate of the order of 10⁻⁴ to 10⁻⁵ per deoxynucleotide. In fact, some DNA polymerases, notably the reverse transcriptases of retroviruses, including HIV, the etiologic agent for AIDS, are highly error prone and incorporate a wrong nucleotide for every 10² to 10³ nucleotides. These mistakes result in a high frequency of mutation in the viral protein, which helps the virus escape from immunosurveillance. The overall fidelity of DNA replication is significantly enhanced by several additional means. The editing or proof-reading function of the replication machinery is a 3′ → 5′ exonuclease (which is either an intrinsic activity of the core DNA polymerase or is present in another subunit protein of the replication complex) which tests for base pair mismatch during DNA replication and removes the misincorporated base. Such an editing function is also present during RNA synthesis. In addition, after replication is completed, the nascent duplex is scanned for the presence of mispaired bases. Once such mispairs are marked by mismatch recognition proteins, a complex mismatch repair process is initiated, which causes removal of a stretch of the newly synthesized strand spanning the mismatch, followed by resynthesis of the segment, as described later.
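To make the quoted error rates concrete, the sketch below multiplies each rate by the approximate mammalian genome size given earlier in the article (about 3 × 10⁹ nucleotide pairs) to estimate the number of misincorporations per copy of one genome's worth of DNA. It is simple arithmetic on the article's own figures; the names are illustrative, and no correction is made for copying both strands.

```python
import math

GENOME_NT = 3e9  # approximate mammalian genome size quoted earlier in the article

# Error rates per incorporated deoxynucleotide, as quoted in the text.
RATES = {
    "polymerase alone (upper)": 1e-4,
    "polymerase alone (lower)": 1e-5,
    "overall replication (upper)": 1e-6,
    "overall replication (lower)": 1e-7,
}


def expected_errors(rate_per_nt: float, nucleotides: float = GENOME_NT) -> float:
    """Expected number of misincorporated nucleotides for a given error rate."""
    return rate_per_nt * nucleotides


for label, rate in RATES.items():
    print(f"{label:<28} {expected_errors(rate):>12,.0f} errors per genome copy")

# Fold improvement between polymerase alone and the complete process.
fold = RATES["polymerase alone (upper)"] / RATES["overall replication (upper)"]
print(f"fold improvement (upper bounds): about {fold:.0f}")
```

On these figures the additional fidelity mechanisms account for roughly a hundredfold reduction in errors relative to the polymerase acting alone.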

H. Replication of Telomeres—The End Game

Because DNA synthesis proceeds unidirectionally from 5′ → 3′ with respect to deoxyribose, by sequential addition of deoxynucleotides to the 3′ terminus of the deoxynucleotide added last, chain elongation can proceed to the terminus of the template strand oriented in the 3′ to 5′ direction. But what about synthesis of the terminus of the complementary strand? Because synthesis of this discontinuous (lagging) strand occurs in the opposite direction by repeated synthesis of a primer, the terminus could not be replicated. This problem of end replication does not arise in the circular genomes of bacteria and the small genomes of plasmids and viruses. However, in the case of linear eukaryotic chromosomes, the problem is solved by a specialized mechanism of telomere replication. Telomeres are repeats of short G-rich sequences found at both ends of the chromosomes (Fig. 6). In the human genome, the telomere repeat unit is 5′ (T/A)ₘGₙ 3′, where n > 1 and 1 < m < 4. Telomerase is a special DNA polymerase (reverse transcriptase) containing an oligoribonucleotide template 5′ Cₙ(A/T)ₘ 3′ (which is complementary to the telomere repeat sequence) as an integral part of the enzyme (Fig. 6). In the presence of other accessory proteins, telomerase utilizes its own template to generate the telomeric repeat unit and, by "slippage," utilizes the same oligoribonucleotide template repeatedly to generate thousands of repeats of the same hexanucleotide unit sequence. Because synthesis of the lagging strand terminal region does not require an external DNA template, the newly synthesized DNA is present in an extended single-stranded region. Telomeres provide a critical protective function to the chromosome by their unique structures and prevent their abnormal fusion.
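The copy-and-slip cycle of telomerase can be illustrated with a toy model. The sketch below assumes the well-known human hexanucleotide repeat TTAGGG, which is not spelled out in this article, and a simplified six-nucleotide RNA template (CCCUAA, written 5′ to 3′) in place of the enzyme's longer natural template. The function names are hypothetical, and accessory proteins, primer alignment, and strand chemistry are ignored.

```python
# RNA template base -> complementary DNA base added by reverse transcription.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def copy_template(rna_template_5to3: str) -> str:
    """Return the DNA copy of an RNA template, written 5'->3'.

    The template is read 3'->5' while DNA grows 5'->3', so the product
    is the reverse complement of the template.
    """
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_template_5to3))


def extend_telomere(dna_3prime_end: str, rna_template_5to3: str, cycles: int) -> str:
    """Add one repeat per cycle to the 3' end of a telomeric DNA strand."""
    repeat = copy_template(rna_template_5to3)
    for _ in range(cycles):
        # One round of copying; "slippage" then repositions the same
        # template at the new 3' end for the next round.
        dna_3prime_end += repeat
    return dna_3prime_end


TEMPLATE = "CCCUAA"  # simplified, hypothetical stand-in for the telomerase RNA template
print(copy_template(TEMPLATE))
print(extend_telomere("GGTTAGGG", TEMPLATE, cycles=3))
```

Each cycle appends the same hexanucleotide without consulting any external DNA template, which is why the product extends as a single-stranded 3′ overhang.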

I. Telomere Shortening: Linkage Between Telomere Length and Limited Life Span

One profound implication of the specialized telomere structure and its synthesis is that in the absence of telomerase, the repeat length of telomeres could not be maintained. Telomerase is active in neonatal cells and also in some immortal tumor cells, but is barely detectable in diploid, terminally differentiated mammalian cells. Most such diploid cells can multiply in vitro in specialized culture medium, but have a limited life span. Loss of replicative capacity is associated with shortening of telomere repeat lengths. Furthermore, ectopic and stable expression of telomerase in human diploid cells by introduction of its gene confers an indefinite reproductive life on such cells. It is generally believed that cells will senesce if the telomere length is reduced below a critical level after repeated replication of the genome.
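The link between telomere length and a limited replicative life span can be sketched as a simple countdown. All numbers below (starting length, loss per division, critical length) are hypothetical placeholders chosen only to show the logic; the article gives no such values, and real shortening is neither constant nor deterministic.

```python
from typing import Optional


def divisions_until_senescence(
    start_bp: int,
    loss_per_division_bp: int,
    critical_bp: int,
    telomerase_gain_bp: int = 0,
) -> Optional[int]:
    """Count divisions until telomere length falls to the critical level.

    Returns None when telomerase replaces at least as much as is lost,
    i.e. the length is maintained and no senescence limit is reached.
    """
    net_loss = loss_per_division_bp - telomerase_gain_bp
    if net_loss <= 0:
        return None
    length, divisions = start_bp, 0
    while length > critical_bp:
        length -= net_loss
        divisions += 1
    return divisions


# Hypothetical values for illustration only.
print(divisions_until_senescence(10_000, 100, 5_000))       # no telomerase
print(divisions_until_senescence(10_000, 100, 5_000, 100))  # telomerase active
```

In this toy model a cell without telomerase stops after a finite number of divisions, whereas full compensation by telomerase removes the limit, mirroring the effect of ectopic telomerase expression described above.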
