[Figure 2: energy matrix, rows A, C, G, T]
Fig. 2. Energy matrix for a transcription factor binding site. An energy matrix
represents one possible physical interpretation of a weight matrix. Each element
of the matrix quantitatively defines the binding energy (in arbitrary units)
between a DNA base pair and a corresponding compartment of the DNA-binding
surface. Energy matrices can be used to compute binding constants for
DNA-protein complexes, and therefore represent a special case of QSAR
(quantitative structure-activity relationship) models. Note that the sign of the
energy units is reversed: a high weight matrix score signifies a low energy
value, and thus high binding strength.
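The relationship between a weight matrix score, the corresponding energy, and a
binding constant can be sketched in a few lines of Python. This is a minimal
illustration, not the author's method: the four-position matrix WEIGHTS is
invented for the example and does not reproduce the values of Fig. 2, the
function names are hypothetical, and the Boltzmann-style conversion with a
thermal scale kt (in the same arbitrary units as the matrix) is an assumption
about how energies would be turned into relative binding constants.

```python
import math

# Hypothetical 4-position weight matrix in arbitrary units.
# These numbers are illustrative only; they are NOT the values of Fig. 2.
WEIGHTS = {
    "A": [10, 5, 3, 5],
    "C": [3, 13, 7, 2],
    "G": [14, 2, 11, 5],
    "T": [5, 5, 9, 12],
}


def matrix_score(site, weights=WEIGHTS):
    """Sum the per-position weights of a site (additive model)."""
    length = len(next(iter(weights.values())))
    if len(site) != length:
        raise ValueError(f"site must have length {length}")
    return sum(weights[base][pos] for pos, base in enumerate(site.upper()))


def binding_energy(site, weights=WEIGHTS):
    """Energy with reversed sign: a high score means a low energy."""
    return -matrix_score(site, weights)


def relative_binding_constant(site, reference, kt=1.0, weights=WEIGHTS):
    """Ratio K(site) / K(reference) under an assumed Boltzmann relation.

    kt is a hypothetical thermal scale in the same arbitrary units
    as the matrix entries.
    """
    delta_e = binding_energy(site, weights) - binding_energy(reference, weights)
    return math.exp(-delta_e / kt)


# Highest-scoring (lowest-energy) site for this toy matrix.
best_site = "".join(
    max("ACGT", key=lambda base: WEIGHTS[base][pos]) for pos in range(4)
)

print(best_site, matrix_score(best_site))
print(relative_binding_constant("ACGT", best_site))
```

Under these assumptions, each mismatch with the best site lowers the score,
raises the energy by the same amount, and reduces the relative binding constant
exponentially.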
sequence motif (ribosome binding sites) was inspired by a machine learning
method called “perceptron” [7].
In summary, DNA motif discovery is not an isolated, specialized topic for a
closed circle of bioinformaticians. DNA motifs have many facets and mean
different things to different researchers. Mathematically equivalent descriptors
have been used in many other fields, even outside the life sciences. Hidden
Markov models [8], for instance, were developed in the speech recognition field.
In the following sections, a personal view of motif discovery will be presented,
inspired partly by the author's own work. The focus will be on essential
concepts and open questions. Methods will be presented in their most basic
version; a comprehensive review of current state-of-the-art motif discovery
algorithms is beyond the scope of this chapter. Further references on methods
can be found in recent reviews [9,10].