MODELING RNA FOLDING - Complex Systems Science in Biomedicine

Biomedical Engineering Reference

Figure 1. RNA secondary structure of a 5S ribosomal RNA. Secondary structure graph (left),

mountain representation (middle), dot plot (right), and bracket notation (bottom). In the

"mountain representation" each base pair (i, j) is represented by a bar from i to j. In the upper

right half of the dot plot, every possible base pair (i, j) is represented by a square in row i and

column j, with area proportional to its probability p_ij in thermodynamic equilibrium; the lower

half of the plot only shows those pairs that are part of the optimal structure. In the bracket

notation a secondary structure is encoded by a string of dots and brackets, where dots represent

unpaired bases and matching brackets represent base pairs. In all representations base pairs of

the three arms of the structure are color coded for easier comparison.

structures can be predicted with reasonable accuracy, and have proven to be a

biologically useful description.

A secondary structure of a given RNA sequence is the list of (Watson-Crick

and wobble) base pairs satisfying two constraints: (1) each nucleotide takes part

in at most one base pair, and (2) base pairs do not cross, i.e., there are no knots

or pseudo-knots. While pseudo-knots are important in many natural RNAs

(145), they can be considered part of the tertiary structure for our purposes. Secondary

structure can be represented in various equivalent ways (see Figure 1).
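
The two constraints and the bracket notation can be made concrete with a short sketch. The Python below is an illustration only, not code from any of the packages discussed here; the function names parse_dot_bracket and is_secondary_structure are hypothetical, and positions are assumed to be numbered from 1.

```python
def parse_dot_bracket(structure):
    """Convert a dot-bracket string into a sorted list of base pairs (i, j).

    Positions are 1-based and i < j. Dots are unpaired bases; each ")"
    closes the most recently opened "(".
    """
    stack = []   # positions of opening brackets not yet closed
    pairs = []
    for pos, ch in enumerate(structure, start=1):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def is_secondary_structure(pairs):
    """Check the two constraints on a list of pairs (i, j) with i < j."""
    # Constraint (1): each nucleotide takes part in at most one base pair.
    positions = [p for pair in pairs for p in pair]
    if len(positions) != len(set(positions)):
        return False
    # Constraint (2): no two pairs (i, j) and (k, l) cross, i.e. i < k < j < l.
    for i, j in pairs:
        for k, l in pairs:
            if i < k < j < l:
                return False
    return True
```

For example, parse_dot_bracket("((..))") yields the pairs (1, 6) and (2, 5), which satisfy both constraints, whereas the pair list [(1, 3), (2, 4)] crosses and is rejected. Any string accepted by the parser is automatically free of crossings, which is why the bracket notation can encode exactly the knot-free structures.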

The restriction to knot-free structures is necessary for efficient computation

by means of dynamic programming algorithms (55,56,95,98,119,140,149,153-155).

The memory and CPU requirements of these algorithms scale with sequence

length n as O(n^2) and O(n^3), respectively, making structure prediction

feasible even for large RNAs of about 10000 nucleotides, such as the genomes

of RNA viruses (57,64,148). There are two implementations of variants

of these dynamic programming algorithms: the mfold package by Michael Zuker,

and the Vienna RNA Package by the present authors and their collaborators.

The latter is freely available from http://www.tbi.univie.ac.at/.
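
To illustrate where the quadratic memory and cubic time come from, the sketch below uses a deliberately simplified stand-in: a Nussinov-type recursion that maximizes the number of base pairs instead of minimizing free energy. It is not the algorithm implemented in mfold or the Vienna RNA Package; the function name max_base_pairs and the min_loop parameter (minimum number of unpaired bases enclosed by a pair, assumed to be 3) are choices made for this example.

```python
# Watson-Crick and wobble pairs, as in the definition of secondary structure.
ALLOWED = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of non-crossing base pairs in seq (simplified model)."""
    n = len(seq)
    if n == 0:
        return 0
    # table[i][j] = best pair count on the subsequence seq[i..j] (0-based,
    # inclusive). The table has n * n entries: O(n^2) memory.
    table = [[0] * n for _ in range(n)]
    # Fill by increasing subsequence length. Two nested loops over (i, j)
    # plus an inner loop over k give O(n^3) time.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = table[i][j - 1]              # case: j is unpaired
            for k in range(i, j - min_loop):    # case: j pairs with k
                if (seq[k], seq[j]) in ALLOWED:
                    left = table[i][k - 1] if k > i else 0
                    inner = table[k + 1][j - 1]
                    best = max(best, left + 1 + inner)
            table[i][j] = best
    return table[0][n - 1]
```

Because pairs may not cross, a pair (k, j) splits the problem into two independent subproblems, the part to the left of k and the part enclosed by the pair, which is what makes the table-filling scheme possible. For the sequence "GGGAAACCC" the function returns 3, corresponding to a single hairpin with a three-base loop.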

These thermodynamic folding algorithms are based on an energy model that

considers additive contributions from stacked base pairs and various types of

loops; see e.g. (92,137). Two widely used methods for determining nucleic acid
