RNA Sequencing (Molecular Biology)

The majority of RNA sequences have been obtained by sequencing either cloned DNA of RNA genes or complementary DNA prepared by reverse transcription of RNA. Such sequences do not, however, provide information about post-transcriptional modification of RNA. In addition to RNA splicing, cleavage, addition of a cap, polyadenylation, and RNA editing, the primary transcripts of eukaryotic and prokaryotic RNAs are often processed by specific modification of the nucleotide residues. Almost 100 base and ribose modifications of the standard four ribonucleotides are known (1), and they are located primarily in the functionally important regions of RNAs. Consequently, rapid methods for RNA sequencing, similar to those for DNA sequencing, have been developed, in addition to special chemical, enzymatic, and spectrophotometric techniques to identify modified nucleotides in RNA.

1. Sequencing Oligoribonucleotides

Methods to sequence RNA were developed almost two decades before rapid DNA sequencing technology became available. The early sequencing of transfer RNA used column chromatography to separate oligoribonucleotides. Guanine-specific ribonuclease T1 (RNase T1) from Aspergillus oryzae and pyrimidine-specific ribonuclease A (RNase A) from bovine pancreas were used to cleave the RNA into two alternative sets of oligoribonucleotides. Spectroscopic characterization, determination of nucleoside composition, and stepwise degradation provided partial sequences from which the sequence of the original RNA was derived. Two-dimensional separation, by high-voltage electrophoresis and chromatography, of the oligoribonucleotides obtained by ribonuclease cleavage, so-called fingerprinting (2), considerably increased the resolution and speed of RNA sequencing. Two oligoribonucleotides that differ by one nucleotide in length show a characteristic mobility shift in the two-dimensional fingerprint, depending on the nucleotide by which they differ. These shifts depend mainly on the pKa values of the particular nucleotide residue. After partial cleavage of end-labeled oligoribonucleotides, followed by two-dimensional separation by electrophoresis and chromatography, a sequence of 8 to 15 nucleotides could be read in one experiment (Fig. 1) (3).
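The two alternative cleavages described above can be illustrated with a minimal Python sketch. It assumes complete digestion and an invented example sequence; the function name digest is hypothetical and not part of any published protocol or library.

```python
def digest(rna: str, cut_after: str) -> list[str]:
    """Split an RNA (written 5'->3') on the 3' side of every residue in cut_after.

    Models a complete digestion: each fragment ends with the recognized residue,
    except possibly the last fragment, which carries the original 3' end.
    """
    fragments, current = [], ""
    for base in rna:
        current += base
        if base in cut_after:
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


rna = "AUGGCUAAGCCUGA"                    # hypothetical example sequence
t1_fragments = digest(rna, "G")           # RNase T1: guanine-specific
rnase_a_fragments = digest(rna, "CU")     # RNase A: pyrimidine-specific

print(t1_fragments)
print(rnase_a_fragments)
```

Because the two enzymes cut at different residues, the fragments of one digest overlap the cut sites of the other, which is what allowed the partial sequences to be ordered into the sequence of the original RNA.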


Figure 1. Two-dimensional separation of radioactively end-labeled oligoribonucleotides obtained by limited cleavage by an endonuclease. First dimension: electrophoresis; second dimension: chromatography. The shift in mobility on removal of each 3′-terminal nucleotide provides its identity (3).

2. Rapid Sequencing by Electrophoresis of End-labeled RNA on Polyacrylamide Gels

The methods developed for rapid sequencing of DNA have been modified for RNA sequencing. 5′-End-labeling is usually achieved by dephosphorylation of RNA with alkaline phosphatase, followed by phosphorylation with phage T4 polynucleotide kinase and [γ-32P]ATP (4). Alternatively, the 3′ end of an oligoribonucleotide can be labeled with cytidine-3′,5′-[5′-32P]bisphosphate ([32P]pCp) and T4 RNA ligase in the presence of ATP (5). Limited base-specific cleavage of end-labeled RNA can be achieved either enzymatically (6, 7) or chemically (8) (Table 1). The oligoribonucleotides generated are separated by electrophoresis, and the labeled bands are detected by autoradiography. For comparison, a nonspecifically cleaved sample is usually prepared by hydrolysis of the end-labeled RNA.

Table 1. Cleavage Reactions for RNA Sequencing

(a) Enzymatic cleavage (6, 7)

Ribonuclease (RNase)                      Specificity     Product
RNase T1 from Aspergillus oryzae          Gp↓N            ...Gp
RNase U2 from Ustilago sphaerogena        Ap↓N            ...Ap
RNase Phy M from Physarum polycephalum    Ap↓N, Up↓N      ...Ap, ...Up
RNase A from bovine pancreas              Cp↓N, Up↓N      ...Cp, ...Up
RNase from Bacillus cereus                Up↓N, Cp↓N      ...Up, ...Cp

The arrow marks the cleaved bond; N is any nucleotide, and each product ends in a 3′-phosphate.

(b) Chemical cleavage (8)

Reagent                                          Cleavage Specificity
Dimethylsulfate, NaBH4; aniline (pH 4.5)         G
Diethylpyrocarbonate; aniline (pH 4.5)           A > G
Hydrazine; aniline (pH 4.5)                      U > C
Hydrazine in 3 M NaCl; aniline (pH 4.5)          C > U
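How a sequence is read from the band pattern on the gel can be sketched in Python. The sketch assumes a 5′-end-labeled RNA, idealized cleavage in which every susceptible bond gives one band, and four enzymatic lanes with the commonly cited specificities (T1: G; U2: A; Phy M: A and U; B. cereus: U and C). The example sequence and the names LANES, CALLS, ladder, and read_gel are hypothetical.

```python
# Residues after which each enzyme cleaves (idealized, assumed specificities).
LANES = {"T1": "G", "U2": "A", "PhyM": "AU", "B.cereus": "UC"}

# Which combination of lanes showing a band identifies which residue.
CALLS = {
    frozenset({"T1"}): "G",
    frozenset({"U2", "PhyM"}): "A",
    frozenset({"PhyM", "B.cereus"}): "U",
    frozenset({"B.cereus"}): "C",
}


def ladder(rna: str, cut_after: str) -> set[int]:
    """Lengths of the 5'-labeled fragments produced by limited cleavage.

    The 3'-terminal residue is skipped: there is no bond after it to cleave.
    """
    return {i + 1 for i, base in enumerate(rna[:-1]) if base in cut_after}


def read_gel(bands: dict[str, set[int]], length: int) -> str:
    """Call the residue at positions 1..length-1 from the lanes with a band."""
    seq = []
    for n in range(1, length):
        lanes_hit = frozenset(lane for lane, lens in bands.items() if n in lens)
        seq.append(CALLS.get(lanes_hit, "N"))   # "N": pattern not interpretable
    return "".join(seq)


rna = "GAUCCGAUAG"                               # hypothetical example sequence
bands = {lane: ladder(rna, bases) for lane, bases in LANES.items()}
print(read_gel(bands, len(rna)))
```

In this idealized model every position except the 3′-terminal residue is called. On a real gel, a position that gives an irregular or missing band would fall through to "N".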

A serious disadvantage of the rapid sequencing methods is their inability to detect modified nucleotides in RNA. Methods were therefore developed that combine separation of oligoribonucleotides according to length by polyacrylamide gel electrophoresis (PAGE) with chromatographic identification of the radioactively labeled nucleotide present at the cleavage site (9, 10). These methods were especially successful in determining the sequences of tRNAs. A mixture of oligoribonucleotides is prepared by nonspecific cleavage in aqueous solution in the presence of formamide at elevated temperature. The 5′-hydroxyl groups of the oligoribonucleotides formed by this cleavage are then labeled by T4 polynucleotide kinase and [γ-32P]ATP. After separation by gel electrophoresis, the bands of the different oligoribonucleotides are blotted onto chromatographic plates and digested in situ to mononucleotides by RNase T2. For each band, the terminal 32P-labeled nucleotide is identified separately by one- or two-dimensional chromatography (Fig. 2).

Figure 2. Sequencing of RNA by the post-labeling method. Limited alkaline hydrolysis leads to a mixture of oligoribonucleotides, which are end-labeled and then separated by polyacrylamide gel electrophoresis (PAGE; first dimension) and by chromatography (second dimension). After hydrolysis of each fragment by ribonuclease T2, the labeled nucleotide at the terminus is identified by chromatography (9).
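The logic of the post-labeling method can be sketched as follows. The sketch assumes that each RNA molecule is cleaved at most once, that only the 3′ fragment (which carries the newly formed 5′-hydroxyl) becomes labeled, and that chromatography identifies every labeled nucleotide unambiguously. The residue list and function names are hypothetical.

```python
def post_label_bands(residues: list[str]) -> dict[int, str]:
    """Map fragment length -> labeled 5'-terminal residue for single-hit cleavage."""
    bands = {}
    for cut in range(1, len(residues)):   # cleavage between residue `cut` and `cut + 1`
        fragment = residues[cut:]         # 3' fragment with a free 5'-hydroxyl
        bands[len(fragment)] = fragment[0]
    return bands


def read_sequence(bands: dict[int, str]) -> list[str]:
    """Read the bands from longest to shortest, i.e. 5'->3' along the RNA."""
    return [bands[length] for length in sorted(bands, reverse=True)]


# Hypothetical RNA containing modified residues (abbreviations as in Fig. 3).
residues = ["G", "C", "m1A", "U", "D", "C", "m5C", "G", "T", "A"]
print(read_sequence(post_label_bands(residues)))
```

Because the identity of each residue comes from chromatography of the labeled nucleotide rather than from cleavage specificity, modified residues such as m1A or D are read directly. In this model the 5′-terminal residue of the intact RNA is not determined.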

Sequencing of RNA by primer extension using AMV reverse transcriptase (11) is very similar to the common enzymatic Sanger method of DNA sequencing with chain-terminating inhibitors. A synthetic oligodeoxyribonucleotide complementary to part of the RNA to be sequenced is used as a primer for DNA synthesis in the presence of radioactively labeled deoxynucleoside-5′-triphosphates and one of the four 2′,3′-dideoxynucleoside-5′-triphosphate analogues. The sequence is then read from the ladder produced by PAGE. More common, however, is transcription of the RNA to DNA, cloning, and determination of the sequence by standard DNA-sequencing technology. Reverse transcription is now used primarily to identify positions occupied by bulky modified nucleotides, where the reverse transcriptase frequently terminates DNA synthesis. The corresponding positions are then detected as enhanced bands on the PAGE sequencing ladder. The primer extension method has been used successfully to identify regions of RNA susceptible to chemical reagents (12) and to identify modified nucleotides in ribosomal RNA (13).

3. Identification of Modified Nucleotides in RNA

Rapid sequencing of RNA may provide hints of the presence of modified nucleotides, because the bands of oligonucleotides terminating at such positions may be irregular, missing, or enhanced on the sequencing gels. This is due to the different reactivities of modified nucleotides toward chemical reagents and ribonucleases. However, analysis by thin-layer chromatography (4) or HPLC (14) and comparison with synthetic standards is still the most common way of detecting modified nucleotides in RNA (Fig. 3).

Figure 3. Identification of modified nucleoside-5′-phosphates by two-dimensional chromatography (10). RNA is digested by nuclease P1, and the nucleotides are separated by electrophoresis in the first dimension and chromatography in the second. i6A denotes N6-isopentenyladenosine; Um, 2′-O-methyluridine; m6A, N6-methyladenosine; s4U, 4-thiouridine; m1A, 1-methyladenosine; Gm, 2′-O-methylguanosine; m5C, 5-methylcytidine; Ψ, pseudouridine; D, 5,6-dihydrouridine; m7G, 7-methylguanosine; T, 5-methyluridine.

Considerable progress has been achieved in analysis of modified nucleotides by coupling liquid chromatography and electrospray ionization mass spectrometry. This method was used successfully to analyze modified nucleosides in tRNA and ribosomal RNA (15, 16). In view of the rapid progress in instrumentation, the limits of this method have not yet been defined.
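Why mass spectrometry can reveal many modifications can be illustrated by a short calculation of monoisotopic nucleoside masses. The elemental formulas and atomic masses below are standard values, not taken from this article, and the helper names are hypothetical. The sketch covers only the mass shift itself, not chromatographic retention or fragmentation.

```python
# Monoisotopic atomic masses (Da) of the most abundant isotopes.
ATOM = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Elemental composition of the four standard ribonucleosides.
NUCLEOSIDE = {
    "A": {"C": 10, "H": 13, "N": 5, "O": 4},
    "G": {"C": 10, "H": 13, "N": 5, "O": 5},
    "C": {"C": 9, "H": 13, "N": 3, "O": 5},
    "U": {"C": 9, "H": 12, "N": 2, "O": 6},
}


def mass(formula: dict[str, int]) -> float:
    """Monoisotopic mass of a neutral molecule from its elemental formula."""
    return sum(ATOM[element] * count for element, count in formula.items())


def modified(formula: dict[str, int], delta: dict[str, int]) -> dict[str, int]:
    """Return a new formula with the atoms in delta added."""
    result = dict(formula)
    for element, count in delta.items():
        result[element] = result.get(element, 0) + count
    return result


METHYL = {"C": 1, "H": 2}      # replacing H by CH3 adds CH2
DIHYDRO = {"H": 2}             # reduction of the 5,6 double bond adds H2

m6A = modified(NUCLEOSIDE["A"], METHYL)
D = modified(NUCLEOSIDE["U"], DIHYDRO)

print(f"A   {mass(NUCLEOSIDE['A']):.4f}")
print(f"m6A {mass(m6A):.4f}")
print(f"U   {mass(NUCLEOSIDE['U']):.4f}")
print(f"D   {mass(D):.4f}")
```

Methylation and reduction change the mass, so m6A and D are distinguished from their parent nucleosides by mass alone. Isomers are not: pseudouridine has the same elemental formula as uridine, and m1A the same as m6A, so these pairs must be resolved by their chromatographic behavior or by other means.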
