RNA is structurally more versatile than DNA and has more diverse biological roles. It is an informational molecule that serves as genetic material and as the mediator of genetic information from DNA to protein; it also plays a vital structural role in many ribonucleoprotein particles and, as is now recognized, can itself have catalytic activity. Whereas in the hypothetical RNA world RNA would have acted unaided by other macromolecules, in modern organisms RNA almost invariably functions in association with RNA-binding proteins, a wide variety of which have now been identified. Although many RNA-binding proteins are structurally uncharacterized and do not fall into known categories, primary sequence analysis has led to the identification of a number of recurring RNA-binding motifs in functionally diverse RNA-binding proteins (1, 2). Interestingly, in most cases these are structurally, and therefore probably evolutionarily, distinct from DNA-binding motifs. RNA molecules are classified into various types based on their function or localization, such as transfer RNA (tRNA), messenger RNA (mRNA), ribosomal RNA (rRNA), viral RNA (vRNA), small nuclear RNA (snRNA), and small cytoplasmic RNA (scRNA). Table 1 gives some examples of RNA-binding proteins associated with each form, and Table 2 summarizes, with examples, the known RNA-binding motifs. Understanding how RNA-binding proteins specifically interact with their target RNA to form functional complexes is a key problem in structural biology that has ramifications throughout molecular biology, from transcription to protein biosynthesis and from replication of viruses to development of embryos and genetic disease.
Table 1. Selected RNA-Binding Proteins, their RNA Targets, and their Functions

| RNA | RNA-binding proteins | Function |
| --- | --- | --- |
| tRNA | tRNA (m5U54)methyltransferase | General tRNA modification |
| | tRNA-guanine transglycosylase | Specific tRNA modification |
| | Aminoacyl-tRNA synthetase | tRNA aminoacylation |
| | Met-tRNAfMet formyltransferase | Prokaryote translation initiation |
| | IF3 | Prokaryote translation initiation |
| | EF-Tu | tRNA transport to ribosome |
| M1 RNA | C5 protein | Subunit of E. coli RNase P endoribonuclease |
| mRNA/hnRNA | RNA polymerase II | Transcription |
| | GreA | Transcription termination |
| | hnRNP proteins | Pre-mRNA binding |
| | Sex-lethal | Pre-mRNA splicing |
| | mRNA capping enzyme | 5′ capping enzyme |
| | E. coli RNase III | RNA processing |
| | dsRNA adenosine deaminase | mRNA editing |
| | CBP20, CBP80 | Nuclear cap-binding protein subunits |
| | eIF-4F | Cytoplasmic cap-binding protein |
| | Poly(A) polymerase, CstF | 3′ polyadenylation |
| | Poly(A)-binding protein (PABP) | mRNA stability |
| | αCP-1, αCP-2 | α-Globin mRNA stability |
| | Staufen | mRNA localization |
| | Endo-, exoribonucleases | mRNA degradation |
| | IRE-binding protein | Translational regulation |
| | E. coli threonyl-tRNA synthetase, T4 regA | Translational regulation |
| | Thymidylate synthase | Translational regulation |
| rRNA | RNA polymerase III | rRNA transcription |
| | Nucleolin | rRNA processing |
| | Nucleolar 2′-O-methyltransferase | rRNA modification |
| | TFIIIA | 5S rRNA storage |
| | Sx, Lx | Small and large subunit ribosomal proteins |
| snRNA | B, B′, D1, etc. | Core snRNP proteins (Sm proteins) |
| | U1A, U2B″, etc. | Specific snRNP proteins |
| | Human RNA helicase A | RNA helicase |
| vRNA | RNA-dependent RNA polymerase | Replication |
| | Viral nucleocapsid proteins | Assembly, protection |
| | MS2 coat protein | Viral capsid, translational repressor |
| | dsRNA-dependent protein kinase | Translation regulation |
| | Reverse transcriptase | Copying RNA to DNA |
| | RNA ligase, DNA ligase | Ligating RNA |
| | HIV-Tat | Transcriptional activator |
| scRNA (SRP RNA) | Signal recognition particle proteins | Targeting of secretory proteins |
| gRNA | TUTase | mRNA editing |

Table 2. Various RNA-Binding Domains and Sequence Motifs

| Sequence Motif | Representative Proteins and Reference Describing Structure (in parentheses) | Target RNA |
| --- | --- | --- |
| RNP domain (RNA recognition motif) | U1A spliceosomal protein (7) | U1 snRNA |
| | U1A protein (9,10) | mRNA |
| | U1 70K spliceosomal protein (1,6) | U1 snRNA |
| | hnRNP protein C (1) | mRNA precursor |
| | hnRNP protein A1 | mRNA precursor |
| | Poly(A)-binding protein (1) | mRNA poly(A) tail |
| dsRNA-binding motif | E. coli RNase III (12) | RNA transcript |
| | Drosophila Staufen (11) | Maternal bicoid mRNA |
| KH domain | hnRNP protein K (1) | mRNA precursor |
| | Fragile X protein (13) | Unknown |
| | Vigilin (14) | tRNA? |
| Zn-finger | TFIIIA (18) | 5S rRNA |
| S1 domain | Polynucleotide phosphorylase (15) | |
| | Ribosomal protein S1 | mRNA |
| | Initiation factor 1 | |
| Sm domain | Spliceosomal core proteins (2) | snRNAs |
| OB domain | Class IIb aminoacyl-tRNA synthetase (19-21) | tRNA |
| RGG box | hnRNP proteins (1) | mRNA precursor |
| Alu binding module | SRP 14/9 heterodimer | Alu domain of SRP RNA |

Some of the important questions concerning the nature of protein-RNA recognition can be illustrated by considering tRNAs (see Transfer RNA). They function as adaptor molecules that specify amino acid residues corresponding to particular triplet codons. There are 20 aminoacyl-tRNA synthetases, one for each amino acid, and each of these enzymes must specifically recognize and aminoacylate only its cognate tRNA with its cognate amino acid (3, 4). In contrast, the prokaryotic elongation factor Tu (EF-Tu), which introduces aminoacylated tRNA (aa-tRNA) into the A-site of the ribosome, binds to all aa-tRNAs (except initiator Met-tRNA and selenocysteinyl-tRNA) and hence must recognize features common to all elongator aa-tRNAs (5). These are two extreme examples of tRNA recognition mechanisms that are essential in maintaining the fidelity and efficiency of protein biosynthesis. The various tRNAs are transcribed as precursor RNA molecules and must be processed correctly by specific nucleases and a surprisingly large number of different base-modifying enzymes to form mature molecules. These modifying enzymes also have diverse specificities. Some recognize almost all tRNAs [e.g., tRNA (m5U54)methyltransferase]. Others recognize a subset of tRNAs (e.g., tRNA-guanine transglycosylase) or even a unique tRNA species.
A wide variety of RNA-binding proteins are involved in producing, processing, transporting, translating, and degrading mRNA, the most abundant single-stranded RNA in cells. mRNA in eukaryotes is capped at the 5′ end and polyadenylated at the 3′ end, and introns must be correctly excised by the splicing machinery. Mature mRNAs are transported from the nucleus to the cytoplasm for translation. Gene expression is regulated at the transcriptional level and also at the translational level by a variety of mechanisms mediated by mRNA-binding proteins. The controlled degradation of mRNA is also an important process, involving a complex set of endo- and exonucleases. All of these processes depend on precise interactions between mRNA and a large number of specific or nonspecific RNA-binding proteins and enzymes.
RNA is an essential structural and functional component of many ribonucleoproteins (RNPs). Assembly of these RNPs requires specific recognition of RNA structure and/or base identity by RNP protein subunits. The ribosome itself contains rRNA molecules, which are matured by splicing and modifying enzymes in the nucleolus in association with small nucleolar RNPs (snoRNPs). The rRNA may actually play the key catalytic role in peptidyl transfer. Pre-mRNA splicing is a complicated, regulated process involving a number of small nuclear RNPs containing snRNA that form a dynamic complex known as the spliceosome. The substrate of the spliceosome is not naked RNA, but pre-mRNA (also known as heterogeneous nuclear RNA, hnRNA) complexed with various proteins that form particles known as hnRNPs. The mammalian signal recognition particle (SRP) is an scRNP consisting of a 300-nucleotide 7S RNA associated with six protein subunits. It binds to signal peptides at the N-terminal end of secreted proteins as they emerge from the ribosome and directs the nascent-chain/ribosome complex to receptors on the endoplasmic reticulum membrane in eukaryotic cells. Systems homologous to SRPs are found in all living cells. Ribonuclease P, a tRNA-processing endonuclease, is an RNP in which the RNA component plays the catalytic role. Telomerase is a complex enzyme consisting of template RNA and protein subunits that adds multiple repeats of a particular sequence onto the ends of chromosomal DNA. Many viruses contain RNA as their genetic element, and virally encoded RNA-binding proteins play essential roles in the very diverse modes of replication and assembly of viruses.
How do RNA-binding proteins recognize their RNA-binding sites? RNA is distinguished from DNA most readily by the presence of the ribose 2′-OH and because the extended RNA double helix is predominantly in the A-form and has helical parameters significantly different from those of DNA, which is normally in the B-form. These general features permit nonspecific double-stranded RNA-binding proteins to recognize their correct nucleic acid substrate (e.g., dsRNA-dependent protein kinase). On the other hand, specific RNA-binding proteins have more difficulty recognizing undistorted A-form RNA, which is characterized by a deep, narrow major groove and a shallower minor groove. The bases in the major groove are not readily accessible to protein side chains for specific recognition by hydrogen bonding, except near the beginnings or ends of helices. By contrast, the readily accessible minor groove contains less information for base discrimination. In reality, this problem is bypassed because most cellular RNA is single-stranded and forms irregular structures comprising short helices, which result from Watson-Crick pairing of short complementary stretches, interspersed with hairpins, internal loops, bulges, or pseudoknots. Many RNA-binding proteins specifically recognize irregular RNA structures whose bases are more exposed for hydrogen bonding or stacking interactions with protein side chains. Indeed, the irregularities may be specifically induced or enhanced by the protein/RNA interaction. Some RNA molecules fold further into complex tertiary structures (e.g., tRNA) whose unique three-dimensional shape may be sufficient to define specific backbone interactions with an RNA-binding protein (e.g., seryl-tRNA synthetase recognizing tRNASer).
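The idea that a single-stranded RNA forms short helices by Watson-Crick pairing of complementary stretches, separated by unpaired loops, can be illustrated with a small sketch. The Python function below is a toy written for this purpose, not a folding algorithm: the name `find_hairpins` and its parameters (`min_stem`, `min_loop`, `max_loop`) are hypothetical, only canonical A-U and G-C pairs are considered, and no thermodynamics, wobble pairs, bulges, or pseudoknots are modeled.

```python
# Toy illustration only: real RNA structure prediction uses thermodynamic models.
# Only canonical Watson-Crick pairs are allowed (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def find_hairpins(seq, min_stem=4, min_loop=3, max_loop=8):
    """Return (stem_start, stem_length, loop_length) for simple stem-loops.

    A stem-loop here is a run of bases that pairs, in antiparallel
    orientation, with a run further downstream, the two runs being
    separated by an unpaired loop of min_loop..max_loop nucleotides.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for loop_start in range(1, n):
        for loop_len in range(min_loop, max_loop + 1):
            i = loop_start - 1         # last base of the 5' strand of the stem
            j = loop_start + loop_len  # first base of the 3' strand of the stem
            stem = 0
            # Extend the stem outward from the loop while the bases pair.
            while i >= 0 and j < n and (seq[i], seq[j]) in PAIRS:
                stem += 1
                i -= 1
                j += 1
            if stem >= min_stem:
                hits.append((i + 1, stem, loop_len))
    return hits


if __name__ == "__main__":
    # A 4-base-pair stem (GGGC pairing with GCCC) closing a 4-nucleotide loop.
    print(find_hairpins("GGGCAAAAGCCC"))
```

For the example sequence, the only stem-loop reported starts at position 0, with a 4-base-pair stem and a 4-nucleotide loop. In such a structure the loop bases are unpaired, which is the kind of exposed, irregular region that the text describes as accessible to protein side chains.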