Restriction-Modification Systems (Molecular Biology)

Restriction-modification (R-M) systems in bacteria consist of pairs of enzymatic activities that presumably protect the cell from foreign DNA, for example, from infecting bacteriophage. This phenomenon was observed almost 50 years ago, when some strains of bacteria, those with an R-M system, were found to be less susceptible to bacteriophage infection and propagation than others. Their infection by bacteriophage was "restricted"; titers of bacteriophage produced were one to five orders of magnitude lower than in other strains, which lack an R-M system (1). Those bacteriophages that did propagate on the restricted hosts, however, were subsequently "immune" to the restriction and were competent to grow normally on those host strains.

The R-M systems act to prevent or to "restrict" the entrance of foreign DNA into the host bacterial cell by fragmenting it enzymatically with the restriction endonuclease, or restriction enzyme. The foreign DNA is recognized because it is not methylated, or "modified," whereas the DNA of the host is, due to the action of a host methyltransferase (see Methylation, DNA). Over 25% of 10,000 bacterial strains surveyed have been found to contain at least one R-M system; some strains contain as many as six (1, 2). R-M systems are dispensable, as bacteria without them are viable and display no deficiencies other than greater sensitivity to bacteriophage infection. The genes for R-M systems can be located on the chromosome, on a plasmid, or on a prophage, and those for each pair of endonuclease and methyltransferase are usually found in proximity. More than 160 R-M systems have been cloned, and most of their genes have been sequenced (3).


1. R-M Enzymes

The R-M phenomenon is explained by the action of the two enzymatic activities making up each R-M system. One activity is a restriction endonuclease, or restriction enzyme, which cleaves DNA, whereas the other is a methyltransferase, or methylase, which methylates the host DNA. These two activities can occur on two separate proteins or on an enzyme complex displaying both activities. The restriction enzyme recognizes a specific DNA sequence (usually 4-8 bp long) and catalyzes phosphodiester bond cleavage at sites within or outside of the sequence to produce a double-strand cut in the duplex DNA. The enzyme thus fragments the DNA, which makes it susceptible to further degradation by general exonucleases. The DNA recognition sequence, or "restriction site," can be unique or degenerate (representing more than one unique sequence), continuous or interrupted by unrecognized base pairs, palindromic or asymmetric (see Staggered Cut). The partner modification methyltransferase activity recognizes the same sequence as its cognate restriction enzyme and catalyzes the transfer of a methyl group from the cofactor S-adenosyl-L-methionine (AdoMet) to specific recipient bases in the sequence; this methylation prevents cleavage by the partner endonuclease. Sequences methylated on either one or both strands are resistant to endonuclease cleavage. The methyltransferases fall into two major mechanistic classes: one group methylates the 5-carbon of cytosine to form 5-methylcytosine (5mC), and the other methylates the exocyclic amino group of either cytosine to form N4-methylcytosine (N4mC) or adenine to form N6-methyladenine (N6mA) (see Methylation, DNA).
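The kinds of recognition sequence described above (unique or degenerate, continuous or interrupted, palindromic or asymmetric) can be illustrated with a minimal Python sketch. It assumes sites are written with standard IUPAC ambiguity codes; the helper names (`reverse_complement`, `is_palindromic`, `find_sites`) are illustrative and not part of any library, and GTYRAC is used only as an example of a degenerate palindromic site.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "M": "[AC]", "K": "[GT]", "N": "[ACGT]",
}
# Complement of every code above (R pairs with Y, M with K; W, S, N are self-complementary).
COMPLEMENT = str.maketrans("ACGTRYWSMKN", "TGCAYRWSKMN")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """A site is palindromic if both strands read the same 5' to 3'."""
    return site == reverse_complement(site)


def find_sites(dna: str, site: str) -> list[int]:
    """Return 0-based start positions of every match on the given strand."""
    pattern = "".join(IUPAC[base] for base in site)
    # A lookahead reports overlapping occurrences as well.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", dna)]


print(is_palindromic("GAATTC"))          # continuous, palindromic
print(is_palindromic("GGATG"))           # continuous, asymmetric
print(is_palindromic("GTYRAC"))          # degenerate, palindromic
print(find_sites("TTGAATTCAAGAATTCGG", "GAATTC"))
```

An interrupted site such as GAG(N7)GTCA can be passed to `find_sites` as `"GAGNNNNNNNGTCA"`. The sketch searches one strand only; for an asymmetric site the reverse complement must be searched as well.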

2. R-M System Nomenclature

R-M systems are named with an italicized three-letter abbreviation derived from the name of the organism in which the system resides. The first letter (uppercase for protein and lowercase for gene designations) comes from the genus, and the lowercase second and third letters come from the species. Any strain designation follows the italicized abbreviation, and Roman numerals distinguish different R-M systems within the same organism. For example, EcoRI, EcoRII, and EcoRV are different systems from Escherichia coli strain R. For the enzymes, the letter R or M, followed by a dot (•), is placed before the name: R•EcoRI and M•EcoRI are the cognate enzyme pair, endonuclease and methylase, respectively, from E. coli strain R. For the genes of R-M systems, the name is completely italicized and followed by the letter R or M to designate the activity, e.g., ecoRIM and ecoRIR for the methylase and endonuclease genes, respectively (2, 4).
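The naming rules above are regular enough to be sketched as a small parser. The following Python sketch is a simplification under stated assumptions: it treats a trailing run of the letters I, V, and X as the Roman numeral and everything between the three-letter abbreviation and that numeral as the strain designation, so a strain name that itself ends in I, V, or X would be split incorrectly. The names `RMName`, `parse_rm_name`, `enzyme_names`, and `gene_names` are hypothetical.

```python
import re
from typing import NamedTuple


class RMName(NamedTuple):
    abbreviation: str  # genus letter plus two species letters, e.g. "Eco"
    strain: str        # strain designation, possibly empty, e.g. "R"
    numeral: str       # Roman numeral distinguishing systems, e.g. "I"


# Abbreviation, then a lazily matched strain, then a trailing Roman numeral.
NAME_RE = re.compile(r"^([A-Z][a-z]{2})(.*?)([IVX]+)$")


def parse_rm_name(name: str) -> RMName:
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognizable R-M system name: {name!r}")
    return RMName(*match.groups())


def enzyme_names(name: str) -> dict[str, str]:
    """Prefix R or M and a dot to name the endonuclease and methylase."""
    parse_rm_name(name)  # validation only
    return {"endonuclease": f"R•{name}", "methyltransferase": f"M•{name}"}


def gene_names(name: str) -> dict[str, str]:
    """Lowercase the genus letter and append R or M for the gene names."""
    parse_rm_name(name)  # validation only
    base = name[0].lower() + name[1:]
    return {"endonuclease": base + "R", "methyltransferase": base + "M"}


print(parse_rm_name("EcoRI"))
print(enzyme_names("EcoRI"))
print(gene_names("EcoRI"))
```

Italics cannot be represented in a plain string, so the sketch carries only the letters of each name.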

3. R-M Gene Expression and Organization

Expression of the genes for R-M systems is assumed to be tightly regulated. The host genome carrying an R-M system must be continuously protected from its restriction endonuclease even during physiologically difficult times, such as intermittent starvation with concomitant decreases in AdoMet concentration. The genes for R-M systems are spatially linked and tandemly organized. The adjacent genes can be arranged (1) in parallel, with the 5′ end of one gene following the 3′ end of the other; (2) convergently, with the 3′ ends of the genes in proximity; or (3) divergently, with the 5′ ends of the genes near one another. Comparisons of the sequences of R-M enzymes have provided information about their evolutionary relationships (5), as well as the means of their classification. Three different types of R-M systems (I, II, and III) are recognized, and different assemblies of proteins exist for each system (Table 1).
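The three gene arrangements described above follow directly from the strand and position of each gene. A minimal Python sketch, assuming each gene is described by hypothetical coordinates on a reference strand and a strand sign ("+" when transcribed left to right, "-" when transcribed right to left); the `Gene` class and `arrangement` function are illustrative names only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # leftmost coordinate on the reference strand
    end: int     # rightmost coordinate on the reference strand
    strand: str  # "+" (transcribed left to right) or "-" (right to left)


def arrangement(a: Gene, b: Gene) -> str:
    """Classify two adjacent genes as parallel, convergent, or divergent."""
    for gene in (a, b):
        if gene.strand not in {"+", "-"}:
            raise ValueError(f"strand must be '+' or '-': {gene}")
    left, right = sorted((a, b), key=lambda g: g.start)
    if left.strand == right.strand:
        # Same direction: the 5' end of one gene follows the 3' end of the other.
        return "parallel"
    if left.strand == "+":
        # Left gene reads rightward, right gene reads leftward: 3' ends meet.
        return "convergent"
    # Left gene reads leftward, right gene reads rightward: 5' ends meet.
    return "divergent"


# Hypothetical coordinates, for illustration only.
r_gene = Gene("R", 100, 1000, "+")
m_gene = Gene("M", 1100, 2300, "-")
print(arrangement(r_gene, m_gene))
```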

Table 1. Restriction-Modification Systems

| System type | I | II | IIs | III |
|---|---|---|---|---|
| Enzymes | 3 subunits: R, M, S (R2M2S) | Independent subunits: R and M proteins (R2 and M) | Independent subunits: R and M proteins (R and M) | 2 subunits: R, M (RM) |
| Cofactors | Restriction: Mg2+ and ATP; AdoMet is allosteric effector. Modification: AdoMet, stimulated by ATP | Restriction: Mg2+. Modification: AdoMet | Restriction: Mg2+. Modification: AdoMet | Restriction: Mg2+; ATP is allosteric effector; stimulated by AdoMet. Modification: AdoMet, stimulated by ATP |
| Recognition site | Asymmetric (a), 3-5 bp half-sites (c) with spacer region | Palindromic (b), 4-8 bp | Asymmetric, 4-7 bp | Asymmetric, 5-6 bp |
| Cleavage site | Variable distances from recognition site, ATP-driven translocation | Within recognition sequence | Under 20 bp 3′ of recognition site | 25-30 bp 3′ of recognition site |
| Modification properties | Opposite strands of each half-site (M2S); prefers hemimethylated sites over nonmethylated sites | Within recognition sequence | Methylation of one or both strands | One strand of each recognition site |
| Sequence specificity | Conferred by S subunit | Intrinsic to enzyme | Intrinsic to enzyme | Conferred by M subunit |
| Enzymatic activities | R and M mutually exclusive | Separate | Separate | Simultaneous if all cofactors present |
| Example (top strand / bottom strand) | EcoAI: GAG(N7)GTCA / CTC(N7)CAGT | EcoRI: GAATTC / CTTAAG | FokI: GGATG(N9) / CCTAC(N13) | HinfIII: CGAAT / GCTTA |

a Recognition sites not having the same sequence in both strands of DNA.

b Recognition sites having the same sequence in both strands of DNA.

c A site containing only half of the necessary DNA sequence for complete enzyme recognition.

3.1. Type I R-M Systems

Fewer than 20 type I R-M systems have been found; they are present in E. coli, Salmonella typhimurium, and Citrobacter freundii. Type I systems contain both the restriction and modification functions in the same multisubunit enzyme. The enzyme is at least a pentameric complex (R2M2S) consisting of three nonidentical subunits, referred to as R (for restriction), S (for DNA sequence specificity), and M (for methylation). The enzymes require Mg2+, AdoMet, and ATP for activity or as allosteric effectors (6). If the recognition sequence is fully methylated, ATP hydrolysis drives the dissociation of the enzyme from DNA. If the recognition site is hemimethylated, the enzyme methylates the other strand and dissociates. If the site is unmethylated, the R subunit cleaves at random DNA sites after ATP-driven translocation of the enzyme to variable distances (up to thousands of bp) from the recognition sequence (6, 7). The asymmetric recognition sequence comprises two half-sites, each 3-5 bp long, separated by a nonspecific sequence of 6-8 bp. In all type I enzymes, the methyltransferase subunit forms specific N6mA residues at each half-site, preferring hemimethylated DNA substrates over unmodified ones (8). The S subunit provides DNA sequence recognition.

The type I systems are grouped into three families, IA, IB, and IC, which are differentiated by their genetic location, immunological cross-reactivity, and gene sequences. The genes of the IA and IB families are located on the chromosome and have the gene order hsdR, hsdM, and hsdS (hsd indicates host specificity determinant). The genes of the IC family are encoded on a plasmid in the order hsdM, hsdS, and hsdR. All families of the type I R-M systems have two adjacent transcriptional units, with the hsdM and hsdS genes transcribed from one promoter and hsdR from the other.

Although type I enzymes (such as EcoB and EcoK) were the first restriction endonucleases to be discovered, type II enzymes have proved more numerous, simpler in composition, and more useful in practical applications.
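The way a type I enzyme responds to the methylation state of its bipartite site can be summarized as a small decision function. The Python sketch below is illustrative only: it uses the EcoAI sequence GAG(N7)GTCA from Table 1 as the example site, reduces each outcome to a short label, and ignores cofactor requirements; `Methylation`, `methylation_state`, and `type_i_outcome` are hypothetical names.

```python
import re
from enum import Enum


class Methylation(Enum):
    UNMETHYLATED = 0
    HEMIMETHYLATED = 1
    FULLY_METHYLATED = 2


# Two half-sites (GAG and GTCA) separated by seven unspecified base pairs.
ECOAI_SITE = re.compile(r"GAG[ACGT]{7}GTCA")


def methylation_state(top_methylated: bool, bottom_methylated: bool) -> Methylation:
    """Count how many strands of the site carry the methyl group."""
    return Methylation(int(top_methylated) + int(bottom_methylated))


def type_i_outcome(state: Methylation) -> str:
    """Return the enzyme's response to a site in the given state."""
    if state is Methylation.FULLY_METHYLATED:
        return "dissociate"   # ATP hydrolysis releases the enzyme
    if state is Methylation.HEMIMETHYLATED:
        return "methylate"    # modify the second strand, then dissociate
    return "cleave"           # translocate along the DNA, then cut


dna = "TT" + "GAG" + "ACGTACG" + "GTCA" + "TT"
print(ECOAI_SITE.search(dna).start())
print(type_i_outcome(methylation_state(True, False)))
```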

3.2. Type II R-M Systems

More than 2600 type II systems, with more than 230 different specificities, are known (3). Type II systems have discrete restriction and modification enzymes. Each enzyme of a pair recognizes the same DNA sequence. Most type II enzymes recognize palindromic, duplex DNA sequences, such as GAATTC, whose complementary strand has the same 5′-3′ sequence. The restriction endonucleases are homodimeric, require Mg2+, and cleave phosphodiester bonds within the recognition sequence or, for type IIs enzymes (see below), adjacent to it, leaving a staggered or blunt double-strand cut (see Staggered Cut and Restriction Enzymes). The methyltransferases are monomeric and require the cofactor AdoMet. Methylation takes place on both strands of the DNA duplex within the recognition sequence, rendering the sequence refractory to cleavage. Interestingly, little amino acid sequence similarity exists between partner endonucleases and methyltransferases, suggesting that the enzymes evolved independently (5, 9) and reflecting the fundamentally different chemical reactions they catalyze and the mechanisms they use for DNA recognition.

Those type II R-M systems in which the two genes are aligned consecutively are believed to operate as single transcriptional units, with the order of the genes being unimportant. Those with the genes arranged divergently or convergently may be subject to independent transcriptional control. The genes for a type II R-M system can be located on the chromosome or on a plasmid. Some type II systems also contain an open reading frame encoding a "controller" or "C" protein, with sequence similarity to some DNA-binding proteins. These controller proteins probably regulate the expression of the genes. For example, disruption of the bamHIC gene in the BamHI R-M system leads to an increase in modification and a decrease in restriction cleavage (2).
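Cleavage within a palindromic site, and its protection by methylation, can be sketched for EcoRI. The Python sketch below assumes the commonly cited EcoRI cut position between G and A (G^AATTC), which is not stated in the text above. It models the top strand only and represents methylation as a set of protected site positions; `ecori_digest` is an illustrative name.

```python
SITE = "GAATTC"
CUT_OFFSET = 1  # assumed cut position: between G and A of the site


def ecori_digest(dna: str, methylated_sites: frozenset[int] = frozenset()) -> list[str]:
    """Split the top strand at every unmethylated GAATTC site.

    `methylated_sites` holds the 0-based start positions of sites that the
    cognate methyltransferase has modified; those sites are left uncut.
    """
    fragments = []
    previous_cut = 0
    position = dna.find(SITE)
    while position != -1:
        if position not in methylated_sites:
            cut = position + CUT_OFFSET
            fragments.append(dna[previous_cut:cut])
            previous_cut = cut
        position = dna.find(SITE, position + 1)
    fragments.append(dna[previous_cut:])
    return fragments


dna = "AAGAATTCCCGAATTCTT"
print(ecori_digest(dna))                  # unmodified (foreign) DNA
print(ecori_digest(dna, frozenset({2})))  # first site protected
print(ecori_digest(dna, frozenset({2, 10})))  # fully modified (host) DNA
```

Because the site is palindromic, the same offset applies to the bottom strand, which is what produces the staggered ends.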

3.3. Type IIs Enzymes

A subset of the type II R-M systems, termed type IIs, share the property of having independent restriction and modification enzymes, but they recognize uninterrupted, asymmetric sequences (4-7 bp long) and cleave DNA 3′ to the recognition sequence, up to 20 bp away, leaving a staggered double-strand cut (see Staggered Cut). These endonucleases act as monomers and require Mg2+ for activation and cleavage. Because type IIs target sequences are asymmetric, the methyltransferases act as monomers to methylate one strand at a time within the recognition sequence. In some type IIs systems, methylation of both strands is performed by a pair of methyltransferases, which may (e.g., HgaI) or may not (e.g., Alw26I) be of the same class (5mC, N4mC, or N6mA) (see Methyltransferase, DNA). In another system, FokI, methylation is accomplished by a fused, bifunctional enzyme (2). There are over 80 characterized type IIs enzymes, representing over 35 specificities (6). Their sequence specificities are similar to those of type III enzymes.
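Cleavage at a fixed distance outside an asymmetric site can be sketched with the FokI entry in Table 1, GGATG(N9) on the top strand and CCTAC(N13) on the bottom strand. The Python sketch below is a simplification: it scans only the top strand for GGATG, so sites in the opposite orientation are ignored, and it reports both cut positions in top-strand coordinates. The function name `foki_cuts` is illustrative.

```python
SITE = "GGATG"
TOP_OFFSET = 9      # top strand is cut 9 nt past the end of the site
BOTTOM_OFFSET = 13  # bottom strand is cut 13 nt past the end of the site


def foki_cuts(dna: str) -> list[tuple[int, int]]:
    """Return (top_cut, bottom_cut) index pairs for each GGATG site.

    A cut index k means the strand is cleaved between positions k-1 and k
    of the top-strand sequence. Sites too close to the end are skipped.
    """
    cuts = []
    position = dna.find(SITE)
    while position != -1:
        site_end = position + len(SITE)
        top_cut = site_end + TOP_OFFSET
        bottom_cut = site_end + BOTTOM_OFFSET
        if bottom_cut <= len(dna):
            cuts.append((top_cut, bottom_cut))
        position = dna.find(SITE, position + 1)
    return cuts


dna = "AA" + "GGATG" + "C" * 20
for top_cut, bottom_cut in foki_cuts(dna):
    # The difference between the two cuts is the length of the staggered end.
    print(top_cut, bottom_cut, bottom_cut - top_cut)
```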

3.4. Type III R-M Systems

Only a few type III R-M systems are known, with four different specificities identified (6). A single bifunctional enzyme catalyzes both the endonuclease (restriction) and the methyltransferase (modification) activities. The enzymes are composed of two nonidentical subunits: the M subunit (encoded by the mod gene) and the R subunit (encoded by the res gene). The R subunit must be complexed with the M subunit for restriction activity, because the M subunit provides the sequence specificity for the enzyme. The two enzymatic activities compete for the uninterrupted, asymmetric DNA recognition sequence, which is usually 5-6 bp long. Restriction activity requires that two copies of the recognition site be proximal on the DNA and in opposite orientations to one another. Such an array of sites may also be viewed as one symmetrical sequence separated by a spacer region of undefined length (10). Type III restriction activity is absent when either a single site or two recognition sites in the same orientation are present. Cleavage takes place 25-30 bp away, to the 3′ side of the DNA recognition sequence. If one or both DNA strands are methylated, no cleavage occurs. The M subunit can act independently as a methyltransferase, requiring AdoMet and methylating only one strand of the duplex recognition sequence at a time, which is sufficient to inhibit the restriction reaction. Methylation is independent of the number and orientation of the restriction sites, suggesting that the enzyme reacts with single sites only (10).
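The orientation requirement for type III restriction can be checked on a sequence by looking for the site on both strands. The Python sketch below uses the HinfIII site CGAAT from Table 1 and is a deliberately reduced model: it ignores the distance between the two sites and their methylation state, both of which matter in practice. The function name `has_inverted_site_pair` is illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def has_inverted_site_pair(dna: str, site: str = "CGAAT") -> bool:
    """True if the asymmetric site occurs in both orientations.

    A copy on the top strand appears as `site`; a copy on the bottom strand
    appears in the top-strand sequence as the reverse complement of `site`.
    """
    on_top_strand = site in dna
    on_bottom_strand = reverse_complement(site) in dna
    return on_top_strand and on_bottom_strand


print(has_inverted_site_pair("TTCGAATGGGGATTCGTT"))  # opposite orientations
print(has_inverted_site_pair("TTCGAATGG"))           # single site
print(has_inverted_site_pair("CGAATGGGCGAATTT"))     # same orientation
```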

3.5. Other R and M Systems

Other variations of restriction and modification systems exist. For example, endonucleases that recognize and cleave only methylated DNA have been described. Three such methylation-dependent restriction systems have been isolated from the K12 strain of E. coli: (1) Mrr (methyladenine recognition and restriction) cleaves DNA with sequences containing N6mA or 5mC bases; (2) McrA (modified cytosine restriction) cleaves sequences containing 5mC; and (3) McrBC cleaves DNA containing 5mC, N4mC, or 5-hydroxymethylcytosine (11). No corresponding methyltransferases are known for these systems. These systems are thought to restrict foreign methylated DNA, and their presence must be taken into account when trying to clone methylated DNA. Among the type II enzymes, R•DpnI cleaves only at methylated GmATC sites, whereas DpnII cleaves only at unmethylated GATC sites. These and other enzymes with similar properties can be used to assess the methylation state of DNA.
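The use of DpnI and DpnII to assess methylation reduces to a small truth table. The Python sketch below is a simplification that treats each digest as a yes/no observation for a DNA sample and does not model hemimethylated sites; the function name and the returned labels are illustrative.

```python
def infer_gatc_methylation(cut_by_dpni: bool, cut_by_dpnii: bool) -> str:
    """Infer the GATC methylation state of a sample from two digests.

    DpnI cleaves only methylated GATC sites; DpnII cleaves only
    unmethylated GATC sites.
    """
    if cut_by_dpni and cut_by_dpnii:
        return "mixed"         # sample contains both kinds of site
    if cut_by_dpni:
        return "methylated"
    if cut_by_dpnii:
        return "unmethylated"
    return "undetermined"      # no GATC site, or neither enzyme was active


print(infer_gatc_methylation(cut_by_dpni=True, cut_by_dpnii=False))
print(infer_gatc_methylation(cut_by_dpni=False, cut_by_dpnii=True))
```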

Restriction-independent methylases have also been found, such as the DNA-adenine (Dam) and DNA-cytosine (Dcm) methyltransferases. The Dam enzyme methylates the adenine in the sequence GATC and has various functions in methyl-directed mismatch repair, DNA replication, and gene regulation (12). Dcm functions in very short patch DNA repair, which serves to repair deaminated cytosine bases using the G-containing template of the complementary strand of the DNA as a guide (2, 5). Finally, intron-encoded endonucleases that catalyze methylation-independent restriction have been described. These enzymes recognize large specific DNA sequences (≥18 bp) and cleave within the recognition sequence or up to 20 bp away. They are believed to function in site-specific intron transposition and share some characteristics with type I restriction endonucleases (6).

4. Antirestriction Systems

Bacteriophages have evolved diverse antirestriction mechanisms to defeat the R-M systems of bacteria. Such evasive measures include (1) production of phage-encoded proteins that inhibit host R-M enzymes or destroy R-M cofactors, (2) stimulation of the host modification function, (3) phage self-modification of DNA (using modified bases in their genomes), and (4) evolutionary elimination of restriction sites from the phage genome. As examples, phage T3 contains the gene for the enzyme AdoMet hydrolase, which destroys AdoMet; the Ral protein produced by phage λ inhibits restriction and stimulates the methylation function of type IA enzymes (11). The T-even phages carry glycosylated hydroxymethylcytosine bases, which help them evade restriction (5). On the other hand, a restriction enzyme encoded on bacterial plasmid Rts1, PvuRts1I, specifically cleaves only phage DNA sequences containing hydroxymethylcytosine residues (13). Evolution has clearly promoted the development of both defensive and offensive mechanisms in the competition between bacteria and bacteriophage.
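Elimination of restriction sites from a phage genome (mechanism 4 above) would show up as fewer occurrences of a site than chance predicts. The Python sketch below compares the observed count with a simple expectation. It assumes bases occur independently at the frequencies found in the sequence, which is a crude null model, and it counts one strand only; `count_sites` and `expected_sites` are illustrative names.

```python
from collections import Counter


def count_sites(dna: str, site: str) -> int:
    """Count occurrences of `site` on the given strand, overlaps included."""
    return sum(1 for i in range(len(dna) - len(site) + 1) if dna.startswith(site, i))


def expected_sites(dna: str, site: str) -> float:
    """Expected count if bases were independent with the observed frequencies."""
    if not dna:
        raise ValueError("empty sequence")
    counts = Counter(dna)
    probability = 1.0
    for base in site:
        probability *= counts[base] / len(dna)
    windows = max(0, len(dna) - len(site) + 1)
    return windows * probability


genome = "ACGT" * 100  # toy sequence, for illustration only
observed = count_sites(genome, "GAATTC")
expected = expected_sites(genome, "GAATTC")
print(observed, round(expected, 4))
```

A ratio of observed to expected well below 1 across a real genome would be consistent with site avoidance, though base composition and codon usage can produce the same effect.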

5. Concluding Remarks

R-M systems probably provide an "immune system" for bacteria that shares characteristics with that of eukaryotes (see Immune Response). For example, foreign, "non-self" DNA is distinguished from "self" DNA (11). A provocative alternative hypothesis for the existence of R-M systems suggests that they evolved because of the "selfishness" of their genes. The sociobiologically derived "selfish gene" theory holds that under certain circumstances natural selection accommodates proliferation of genes that are potentially deleterious to the host carrying them (14). In support of this view, two groups have shown that a daughter bacterial cell that does not receive the plasmid encoding the R-M system of the parent is subject to chromosomal DNA cleavage by residual restriction endonuclease remaining after cell division, if the residual methylase activity cannot protect all the restriction sites efficiently (15, 16). Thus, selection for progeny carrying the R-M system exists even in the absence of their "immunity" role. Other rationalizations for the existence of R-M systems include roles for them in DNA recombination or repair, regulation of gene expression, and acquisition of foreign genes, but these roles are less certain (2).

The discovery of R-M enzymes has revolutionized several areas of research and biotechnology (see Cloning). Isolation and characterization of R-M genes and purification of the proteins have led to greater understanding of protein-DNA and enzyme-cofactor interactions. For example, a novel mechanism was found in the X-ray crystal structures of two type II 5mC methyltransferases with their target DNA: the target cytosine base is flipped out of the DNA helix into a pocket in the enzyme (see 5-Methylcytosine). The type II restriction enzymes provide a vast array of sequence-specific DNA cleavage tools that allow manipulation and manageable analysis of an otherwise formidable macromolecule. Several applications of restriction endonucleases and methyltransferases are discussed in the entries Restriction Fragment, Restriction Map, and Staggered Cut. R-M systems provide ideal models for studying basic genetic and biochemical mechanisms, as well as serving as practical tools for expanding other areas of biological research.
