Biology Reference
In-Depth Information
The condition from Eq. (9.7) in the case of only two hidden states means that if we
consider a threshold of 0.5 for Figure 9.8, the hidden states with posterior probabilities
plotted above the horizontal line at 0.5 will be predicted as U states. The overlap is
not perfect, of course, but, as our next exercise shows, the separation between the
states can be even better for hidden processes that only switch between states with
very small probability.
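The thresholding rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the CpG Educate implementation: the state coding (0 = F, 1 = U), the unfair-die emissions, the uniform initial distribution, the single symmetric stay probability, and all function names are assumptions introduced here. The posterior probabilities are computed with a scaled forward-backward pass, and a position is predicted as U whenever its posterior probability exceeds 0.5.

```python
import random

# Assumed model (not from the source): state 0 = F (fair die), state 1 = U (unfair die).
FAIR = [1 / 6] * 6
UNFAIR = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]   # assumed loading toward the face six


def simulate(n, trans, emit, rng):
    """Sample n hidden states and die rolls (faces coded 0..5)."""
    state = rng.randrange(2)               # assumed uniform initial state
    states, rolls = [], []
    for _ in range(n):
        states.append(state)
        rolls.append(rng.choices(range(6), weights=emit[state])[0])
        state = rng.choices([0, 1], weights=trans[state])[0]
    return states, rolls


def posterior_u(rolls, trans, emit, init=(0.5, 0.5)):
    """Scaled forward-backward pass; returns P(state_t = U | all rolls) for each t."""
    fwd, scale = [], []
    for t, x in enumerate(rolls):
        if t == 0:
            row = [init[k] * emit[k][x] for k in (0, 1)]
        else:
            row = [emit[k][x] * sum(fwd[-1][j] * trans[j][k] for j in (0, 1))
                   for k in (0, 1)]
        c = sum(row)                       # scaling factor keeps values away from underflow
        fwd.append([v / c for v in row])
        scale.append(c)
    post = [0.0] * len(rolls)
    bwd = [1.0, 1.0]                       # scaled backward values at the last position
    for t in range(len(rolls) - 1, -1, -1):
        post[t] = fwd[t][1] * bwd[1]       # scaled forward times scaled backward
        if t > 0:
            x = rolls[t]
            bwd = [sum(trans[k][j] * emit[j][x] * bwd[j] for j in (0, 1)) / scale[t]
                   for k in (0, 1)]
    return post


def decode(post, threshold=0.5):
    """Predict U wherever the posterior probability of U exceeds the threshold."""
    return ["U" if p > threshold else "F" for p in post]


def accuracy(stay, n=600, seed=1):
    """Fraction of positions decoded correctly for an assumed symmetric stay probability."""
    trans = [[stay, 1 - stay], [1 - stay, stay]]
    emit = [FAIR, UNFAIR]
    states, rolls = simulate(n, trans, emit, random.Random(seed))
    pred = decode(posterior_u(rolls, trans, emit))
    truth = ["U" if s == 1 else "F" for s in states]
    return sum(a == b for a, b in zip(pred, truth)) / n
```

Calling accuracy(stay) for a few values of stay (for example 0.5, 0.9, and 0.999) gives a rough numerical counterpart to comparing the posterior plot against the 0.5 line. The numbers depend on the assumed emissions and on the random seed.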
Exercise 9.12. Use the Dishonest Casino application in the CpG Educate suite to
experiment with the evaluation algorithms and plot the posterior probabilities of being
in a state generated by the unfair die for sequences x of various lengths (see footnote 10).
Do the same for several sets of transition probabilities, focusing specifically on the two
extremes, as in Exercise 9.8: (a) transition probabilities that are close to uniform, and
(b) distributions for which the process retains its current state with a large probability.
Consider values as large as 0.999 for p and q in this case. Summarize your observations by
answering the following questions: (1) Did you detect any improvement when the
sets of parameters are less like those from part (a) and more like those in part (b)?
If so, in what sense? (2) Consider several sets of transition probabilities to illustrate
that, when the probabilities for switching between the two hidden states get smaller,
the performance of posterior decoding improves.
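One way to see why small switching probabilities may help is to look at how long the hidden process stays in a state. As a sketch, assume p denotes the probability of retaining the current state (the reading suggested by part (b) of the exercise) and let L be the number of consecutive steps spent in that state. The symbol L is introduced here for illustration. Then L is geometrically distributed:

```latex
P(L = k) = p^{\,k-1}(1 - p), \quad k = 1, 2, \dots,
\qquad
\mathbb{E}[L] = \frac{1}{1 - p}.
```

For p = 0.5 the expected run length is 2 rolls, while for p = 0.999 it is 1000 rolls, so each visit to a state produces far more observations from the same emission distribution.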
Exercise 9.13. Repeat Exercise 9.12 but, this time, try different values for the
emission probabilities. Experiment with posterior decoding to get a sense that its
performance improves as the emission distributions for the different states become “more
different.” If, for instance, both emission distributions are nearly or exactly uniform
(e.g., {0.1667, 0.1667, 0.1667, 0.1667, 0.1667, 0.1667} and {0.15, 0.15, 0.2, 0.15,
0.15, 0.2}), the decoding into U and F states will generally be poor. As the emission
distribution for the unfair die becomes more skewed, the performance of the posterior
decoding method improves.
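To put a number on how “different” two emission distributions are, one possible measure is the total variation distance. This measure is a choice made here for illustration and is not prescribed by the exercise. The nearly uniform distribution below is the one quoted in the exercise, while the strongly skewed one is an assumed example.

```python
def total_variation(p, q):
    """Total variation distance: half the L1 distance between two distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


fair = [1 / 6] * 6                           # exactly uniform die
mild = [0.15, 0.15, 0.2, 0.15, 0.15, 0.2]    # nearly uniform, from the exercise text
skewed = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]      # assumed example of a strongly loaded die

tv_mild = total_variation(fair, mild)        # small: the two dice are hard to tell apart
tv_skewed = total_variation(fair, skewed)    # larger: rolls carry more information
```

Under this measure the nearly uniform pair is at distance 1/15 and the skewed pair at distance 1/3, which matches the expectation that the second pair should be easier to decode.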
Exercise 9.14. Use the CpG Islands application in the CpG Educate suite to simulate
sequences of various lengths and compare the performance of Viterbi decoding
versus posterior decoding. Use HMM parameters in the form of Table 9.5 first, then
experiment with general sets of HMM parameters. Use the file Table_9.5.xlsx from
the volume's website to generate sets of HMM parameters in the format of Table 9.5
for different values of p and q. (Do not forget that the file needs to be saved in CSV
format before loading the parameters into the CpG Islands application. See Exercise
9.9 and [30] for more details.)
Exercise 9.15. Repeat Exercise 9.14 for the Dishonest Casino application. Compared
to the HMM from Exercise 9.14, the Dishonest Casino application has fewer
parameters, so you will not need to upload them from a file; just change the parameter
values by typing over the default values that are provided.
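For readers who want to reproduce the Viterbi versus posterior comparison outside the application, the following is a minimal sketch for a two-state casino model. It is not the code behind the Dishonest Casino application: the parameter values, the state coding (0 = F, 1 = U), the uniform initial distribution, and all names are assumptions made for illustration.

```python
import math
import random

# Assumed parameters (not from the source): state 0 = F (fair), state 1 = U (unfair).
STAY = 0.95
TRANS = [[STAY, 1 - STAY], [1 - STAY, STAY]]
EMIT = [[1 / 6] * 6, [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]]
INIT = [0.5, 0.5]


def simulate(n, rng):
    """Sample n hidden states and die rolls (faces coded 0..5)."""
    state = rng.choices([0, 1], weights=INIT)[0]
    states, rolls = [], []
    for _ in range(n):
        states.append(state)
        rolls.append(rng.choices(range(6), weights=EMIT[state])[0])
        state = rng.choices([0, 1], weights=TRANS[state])[0]
    return states, rolls


def viterbi(rolls):
    """Single most probable state path, computed in log space."""
    score = [math.log(INIT[k]) + math.log(EMIT[k][rolls[0]]) for k in (0, 1)]
    back = []
    for x in rolls[1:]:
        ptr, new = [], []
        for k in (0, 1):
            cands = [score[j] + math.log(TRANS[j][k]) for j in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            ptr.append(best)
            new.append(cands[best] + math.log(EMIT[k][x]))
        back.append(ptr)
        score = new
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(back):             # follow back-pointers from the last position
        state = ptr[state]
        path.append(state)
    return path[::-1]


def posterior_decode(rolls):
    """Pick, at each position, the state whose posterior probability exceeds 0.5."""
    fwd, scale = [], []
    for t, x in enumerate(rolls):
        if t == 0:
            row = [INIT[k] * EMIT[k][x] for k in (0, 1)]
        else:
            row = [EMIT[k][x] * sum(fwd[-1][j] * TRANS[j][k] for j in (0, 1))
                   for k in (0, 1)]
        c = sum(row)
        fwd.append([v / c for v in row])
        scale.append(c)
    path = [0] * len(rolls)
    bwd = [1.0, 1.0]
    for t in range(len(rolls) - 1, -1, -1):
        path[t] = 1 if fwd[t][1] * bwd[1] > 0.5 else 0
        if t > 0:
            x = rolls[t]
            bwd = [sum(TRANS[k][j] * EMIT[j][x] * bwd[j] for j in (0, 1)) / scale[t]
                   for k in (0, 1)]
    return path


def agreement(a, b):
    """Fraction of positions at which two state paths coincide."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    truth, rolls = simulate(600, random.Random(0))
    print("Viterbi accuracy:  ", agreement(viterbi(rolls), truth))
    print("Posterior accuracy:", agreement(posterior_decode(rolls), truth))
```

Viterbi returns one globally most probable path, whereas posterior decoding chooses the most probable state position by position, so the two paths can disagree near state changes. Changing STAY or EMIT in this sketch mimics typing over the default values in the application.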
Footnote 10. For optimal viewing of the posterior probabilities, use sequences with lengths between 300 and 1200.
 