1.1. Motif Discovery from a Biological Perspective
The basis of modern biology is the theory of evolution by natural
selection introduced by Darwin and Wallace. One of the tenets of this
theory is that any genetically encoded biological structure is subject to
the randomizing forces of mutation and will eventually disappear if it is
not conserved by natural selection. According to Williams [2], constancy
and complexity are biological proof of function, even in the absence of a
conceivable mechanism by which a conserved structure might contribute to
the organism's fitness. The lateral line organ of fishes is cited as an
example. The high complexity of this organ and its high degree of
conservation across species prompted biologists to carry out experiments,
which eventually led to the identification of its function as a sensory
organ. This is exactly the biological motivation behind motif discovery:
sequence conservation is evidence of natural selection and thus justifies
an investment of experimental work to elucidate the function of a motif.
This approach has been very successful in the study of protein function,
where the discovery of a new conserved domain has often preceded the
characterization of its molecular function. Even though this chapter
focuses on DNA motifs, many of the concepts and methods introduced
extend readily to RNA and protein sequence motifs.
A minimal degree of complexity is an essential property of a motif,
as motifs of low complexity may frequently occur by chance and thus
cannot meet the condition of overrepresentation. While the complexity
of a morphological structure can be judged by human visual intuition,
the complexity of DNA sequence motifs is typically evaluated by a
conditional entropy-based index borrowed from information theory [3].
Unlike the search for new protein motifs, DNA motif discovery is often
targeted to a particular function, which may, however, be broadly
defined. For instance, by searching eukaryotic promoter sequences for
conserved DNA motifs, one typically expects to find the target binding
sites for a variety of a priori unknown transcription factors.
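The link between complexity and chance occurrence can be illustrated
with a minimal sketch. It is not the index used in this chapter: it
assumes a common information-content measure (the per-column relative
entropy of observed base frequencies against a background, in bits) and
a uniform background of 0.25 per base, and it applies no small-sample
correction. The function names information_content and
expected_chance_hits are hypothetical, chosen for this illustration
only.

```python
import math
from collections import Counter


def information_content(sites, background=None):
    """Sum over columns of the relative entropy (bits) of a gapless alignment.

    `sites` is a list of equal-length DNA strings over A, C, G, T.
    Assumption: uniform background unless one is supplied.
    """
    if background is None:
        background = {base: 0.25 for base in "ACGT"}
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = 0.0
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        for base, n in counts.items():
            p = n / len(sites)  # observed frequency; zero counts never appear here
            total += p * math.log2(p / background[base])
    return total


def expected_chance_hits(motif_length, sequence_length):
    """Expected matches of one fixed word on one strand of a random sequence.

    Assumption: independent positions with uniform base frequencies.
    """
    positions = max(sequence_length - motif_length + 1, 0)
    return positions * 0.25 ** motif_length


if __name__ == "__main__":
    conserved = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
    print(round(information_content(conserved), 3))
    # A 4-mer is expected several times per kilobase by chance alone,
    # whereas a 12-mer is expected far less than once.
    print(expected_chance_hits(4, 1003))
    print(expected_chance_hits(12, 1011))
```

Under these assumptions a fully conserved column contributes 2 bits and
a column with all four bases at equal frequency contributes 0 bits, so
short or degenerate motifs score low and are expected to recur by chance
in any sufficiently long sequence.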
Sequence motifs can also be viewed as taxonomic entities. Mastering
a bewildering diversity of phenomena through classification is a typical