are the target of optimization. The search space thus becomes discrete
and finite, but still remains astronomically large. The classical algorithm
in this category is the Gibbs sampler [24], a stochastic variant of EM. This
approach focuses on actual motif instances rather than a generalized
description of the motif. In other words, one would primarily like to
know the locations of the functional elements in the given sequences
rather than be able to predict motif instances in new sequences. From a
theoretical and computational viewpoint, the differences are relatively
minor, since both EM and Gibbs sampling update a probability matrix.
However, whereas EM assigns k-letter subsequences probabilistically to
the two components of the mixture model, Gibbs sampling attributes each
subsequence explicitly to one class or the other. The decision whether a
given subsequence x_j is included in the motif set for the next iteration
is made randomly, based on its current posterior probability w_j, as
defined previously. The new probability matrix is typically estimated by
maximum a posteriori (MAP) estimation, for instance by adding one
pseudocount to each base frequency:
p(i, b) = \frac{1 + \sum_{j \in M_k} \delta(x_{ji} = b)}{|M_k| + 4}    (11)

where δ(x_{ji} = b) is 1 if x_{ji} = b and 0 otherwise, M_k denotes the
current set of motif instances, and |M_k| the number of elements therein.
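As a concrete illustration, the following minimal Python sketch implements one iteration of this update. The function names (map_matrix, sample_motif_set) and the representation of subsequences as strings over {A, C, G, T} are our own assumptions for illustration, not part of the original algorithm description:

```python
import random

BASES = "ACGT"

def map_matrix(motifs, k):
    """MAP estimate of the probability matrix from the current motif set
    M_k, adding one pseudocount per base as in Eq. (11):
    p(i, b) = (1 + count of base b at position i) / (|M_k| + 4)."""
    n = len(motifs)
    counts = [{b: 1 for b in BASES} for _ in range(k)]  # pseudocounts
    for x in motifs:
        for i in range(k):
            counts[i][x[i]] += 1
    return [{b: counts[i][b] / (n + 4) for b in BASES} for i in range(k)]

def sample_motif_set(subsequences, posteriors):
    """Gibbs-style stochastic assignment: include x_j in the next motif
    set with probability w_j (its current posterior), instead of the
    fractional weighting used by EM."""
    return [x for x, w in zip(subsequences, posteriors)
            if random.random() < w]

# Example: sample a new motif set, then re-estimate the matrix from it.
motifs = sample_motif_set(["ACGT", "ACGA", "TCGT"], [0.9, 0.7, 0.2])
p = map_matrix(motifs, 4)
```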
The fact that Gibbs sampling has an in-built stochastic element
helps to overcome the risk of getting trapped in a local optimum far
away from the global optimum. Another advantage of a nondeterministic
algorithm is that it can return different results when run several
times from the same starting point, which also increases its chances of
finding the globally optimal solution. Optimization in motif annotation
space can also be done in a deterministic fashion. For instance, one
could define the new motif instances by accepting all k-letter
subsequences with posterior probabilities higher than 0.5. This leads to an
iterative, multiple alignment algorithm, similar to the PATOP algorithm
described in Sec. 5.
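For comparison, the deterministic variant replaces the random draw with a fixed cutoff. A minimal sketch, again with a hypothetical function name and assuming the same posteriors w_j as above:

```python
def threshold_motif_set(subsequences, posteriors, cutoff=0.5):
    """Deterministic update: accept every k-letter subsequence whose
    posterior probability exceeds the cutoff (0.5 in the text), so the
    same input always yields the same motif set."""
    return [x for x, w in zip(subsequences, posteriors) if w > cutoff]
```

Because no random draw is involved, repeated runs from the same starting point converge identically; this reproducibility is exactly what the stochastic variant trades away in exchange for its ability to escape local optima.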