where the simulations have been generated by using the fair die). Are there any
emerging patterns? Be mindful that due to the stochastic nature of the process, you
should generate several sequences for each set of parameters. Try to summarize your
observations by answering the following questions:
1. Did you detect any difference in the performance of the Viterbi algorithm for the
sets of parameters from part (a) and (b)?
2. If you answered yes in (1), is the performance of the algorithm in decoding the
hidden sequence better for distributions like those in part (a) or for distributions
like those in part (b)?
3. Does the performance of the Viterbi algorithm improve when the values of p and
q are very close to 1?
4. What explanations can you provide for your findings from question (1)? It may
be helpful to revisit Exercises 9.1 and 9.2.
Exercise 9.9. Repeat Exercise 9.8, this time experimenting with the CpG Islands
application in the CpG Educate suite. Consider distributions such as those in Table 9.5
and (a) values for p and q close to 0.5; (b) values for p and q close to 1. Experiment
with both short and long sequences and notice that when p or q is very close to 1, the
hidden process and/or the maximal probability path may never switch between the
"+" and "−" states for relatively short sequences. Use the file Table_9.5.xlsx available
from the volume's website to generate sets of HMM parameters for different values of
p and q. Be sure to save the generated sets of HMM parameters in Comma Separated
Values (CSV) format and use the CSV file to load the parameters into the CpG Islands
application (refer to the CpG Educate tutorial [30] for more details).
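As a minimal sketch of the parameter-file step, the snippet below builds a two-state transition matrix from p and q and writes it to CSV. It assumes p = P(stay in "+") and q = P(stay in "−"); the actual column layout expected by the CpG Islands application comes from Table_9.5.xlsx and may differ, so the header row and file name here are illustrative only.

```python
import csv

def transition_rows(p, q):
    """Two-state CpG transition matrix, assuming p = P(stay in '+')
    and q = P(stay in '-'). Off-diagonal entries are the switch
    probabilities 1 - p and 1 - q."""
    return [["state", "+", "-"],
            ["+", p, 1 - p],
            ["-", 1 - q, q]]

def write_params_csv(path, p, q):
    # newline="" is required by the csv module on all platforms
    with open(path, "w", newline="") as fh:
        csv.writer(fh).writerows(transition_rows(p, q))

# Hypothetical example: p and q close to 1, as in part (b)
write_params_csv("params_p095_q090.csv", 0.95, 0.90)
```

Generating one such file per (p, q) pair makes it easy to sweep the parameter grid suggested in the exercise.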
9.4.2 Evaluation and Posterior Decoding:
The Forward-Backward Algorithm
In this section we outline a computationally efficient algorithm for computing $P(x)$,
the probability of an observed sequence $x = x_1 x_2 \cdots x_l$. We also discuss an
alternative decoding method called posterior decoding.
The forward algorithm is similar in idea to the Viterbi algorithm. Instead of keeping
track of the hidden sequences of maximal probability in state $j$ at time $t$, the
forward algorithm computes the probabilities of being in state $j$ having observed
the first $t$ symbols $x_1 x_2 \cdots x_t$ of the sequence $x$. Denote these probabilities by
$$f_j(t) = P(x_1 x_2 \cdots x_t,\ \pi_t = j), \quad j \in Q.$$
Recursively, as in the Viterbi algorithm, we can now compute
$$f_k(t+1) = P(x_1 x_2 \cdots x_{t+1},\ \pi_{t+1} = k),$$
the probability that the process is in state $k$ at time $t+1$ with emitted sequence
$x_1 x_2 \cdots x_{t+1}$:
$$f_k(t+1) = P(x_1 x_2 \cdots x_{t+1},\ \pi_{t+1} = k) = \left[\sum_{j \in Q} f_j(t)\, a_{jk}\right] e_k(x_{t+1}), \quad k \in Q.$$
 