Biomedical Engineering Reference
Figure 1. RNA secondary structure of a 5S ribosomal RNA. Secondary structure graph (left),
mountain representation (middle), dot plot (right), and bracket notation (bottom). In the
mountain representation, each base pair (i, j) is represented by a bar from i to j. In the upper
right half of the dot plot, every possible base pair (i, j) is represented by a square in row i and
column j, with area proportional to its probability $p_{ij}$ in thermodynamic equilibrium; the lower
half of the plot shows only those pairs that are part of the optimal structure. In the bracket
notation, a secondary structure is encoded by a string of dots and brackets, where dots represent
unpaired bases and matching brackets represent base pairs. In all representations, base pairs of
the three arms of the structure are color coded for easier comparison.
structures can be predicted with reasonable accuracy, and have proven to be a
biologically useful description.
A secondary structure of a given RNA sequence is a list of (Watson-Crick
and wobble) base pairs satisfying two constraints: (1) each nucleotide takes part
in at most one base pair, and (2) base pairs do not cross, i.e., there are no knots
or pseudo-knots. Two base pairs (i, j) and (k, l) cross if i < k < j < l. While
pseudo-knots are important in many natural RNAs (145), they can be considered
part of the tertiary structure for our purposes. Secondary structure can be
represented in various equivalent ways (see Figure 1).
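The two constraints and the bracket notation can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names `parse_dot_bracket` and `is_secondary_structure` are hypothetical, positions are numbered from 1, and every pair (i, j) is assumed to be given with i < j.

```python
def parse_dot_bracket(structure):
    """Convert a dot-bracket string into a sorted list of base pairs (i, j).

    Positions are 1-based and i < j. Hypothetical helper for illustration.
    """
    stack = []   # positions of opening brackets still waiting for a partner
    pairs = []
    for pos, symbol in enumerate(structure, start=1):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def is_secondary_structure(pairs):
    """Check constraints (1) and (2) for a list of pairs (i, j) with i < j."""
    used = set()
    for i, j in pairs:
        # Constraint (1): each nucleotide takes part in at most one base pair.
        if i in used or j in used:
            return False
        used.update((i, j))
    for i, j in pairs:
        for k, l in pairs:
            # Constraint (2): pairs (i, j) and (k, l) cross if i < k < j < l.
            if i < k < j < l:
                return False
    return True
```

For example, `parse_dot_bracket("((..))")` yields the pairs (1, 6) and (2, 5), which pass the check, whereas the pair list `[(1, 4), (2, 5)]` is rejected because the two pairs cross. The quadratic crossing test is chosen for readability, not speed.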
The restriction to knot-free structures is necessary for efficient computation
by means of dynamic programming algorithms (55,56,95,98,119,140,149,153-155).
The memory and CPU requirements of these algorithms scale with sequence
length n as $O(n^2)$ and $O(n^3)$, respectively, making structure prediction
feasible even for large RNAs of about 10,000 nucleotides, such as the genomes
of RNA viruses (57,64,148). Two packages implement several variants
of these dynamic programming algorithms: the mfold package by Michael Zuker,
and the Vienna RNA Package by the present authors and their collaborators.
The latter is freely available from http://www.tbi.univie.ac.at/.
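To make the quadratic memory and cubic time scaling concrete, the following Python sketch implements a simplified Nussinov-style recursion that maximizes the number of base pairs. It is not the energy-based algorithm implemented in mfold or the Vienna RNA Package; it only shares the same dynamic programming structure. The function name `max_base_pairs`, the `PAIRS` table, and the default minimum hairpin loop size of 3 unpaired bases are assumptions made for this illustration.

```python
# Allowed pairs: Watson-Crick (AU, GC) and wobble (GU), in both orientations.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of non-crossing base pairs on seq (simplified sketch).

    min_loop is the assumed minimum number of unpaired bases enclosed by a pair.
    """
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = maximum number of pairs on the subsequence seq[i..j].
    # The table has n * n entries: memory grows as O(n^2).
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):        # O(n) subsequence lengths
        for i in range(n - span):              # O(n) start positions
            j = i + span
            score = best[i][j - 1]             # case 1: position j is unpaired
            for k in range(i, j - min_loop):   # O(n) pairing partners for j
                if (seq[k], seq[j]) in PAIRS:
                    # case 2: j pairs with k, which splits the interval in two
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]
```

The three nested loops give the cubic running time, and the splitting in case 2 is valid only because base pairs are not allowed to cross: the regions to the left of k and between k and j can then be solved independently.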
These thermodynamic folding algorithms are based on an energy model that
considers additive contributions from stacked base pairs and various types of
loops; see e.g. (92,137). Two widely used methods for determining nucleic acid