Biology Reference
In-Depth Information
Table 9.3 The parameter set for the HMM from Example 9.3 in tabular form.

        Transitions       Emissions        Initial Distribution
        F       U         W       L
F       0.95    0.05      0.67    0.33     0.5
U       0.1     0.9       0.40    0.60     0.5
Table 9.4 A set of probabilities (parameters) of a HMM for a DNA sequence where the model is only concerned with the frequencies of the individual nucleotides. The transition matrix of the HMM is under the “Transitions” heading. Each of the hidden states emits a symbol from the set M = {A, C, T, G} with emission probabilities listed under the “Emissions” heading. The hidden process is equally likely to begin in the “+” and “−” state, as stated under the “Initial Distribution” heading.

        Transitions       Emissions                        Initial Distribution
        +       −         A       C       T       G
+       0.90    0.10      0.15    0.33    0.16    0.36     0.5
−       0.05    0.95      0.27    0.24    0.26    0.23     0.5
be used to construct a HMM. When we look only at the nucleotide frequencies as in Table 9.2, we can consider a HMM with a state space Q = {+, −}, where each of these states can emit a symbol from the set M = {A, C, T, G} with emission probabilities as those in Table 9.2. Assuming that the hidden process transitions between the “+” and “−” states are as in Figure 9.5 (where in this case, we will identify the state F with “+” and the state U with “−”), the parameters for the HMM will be those in Table 9.4.
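As a rough illustration, the parameters of Table 9.4 can be written down as plain dictionaries and checked for consistency. The sketch below is not part of the original text: the names `TRANSITIONS`, `EMISSIONS`, `INITIAL`, `is_distribution`, `sequence_probability`, and `total_probability` are hypothetical, the “−” state is written with an ASCII hyphen, and the forward algorithm is used only as one standard way to obtain the probability of an observed sequence.

```python
from itertools import product

STATES = ("+", "-")            # "-" stands for the "−" state of Table 9.4
SYMBOLS = ("A", "C", "T", "G")

# Values copied from Table 9.4.
TRANSITIONS = {
    "+": {"+": 0.90, "-": 0.10},
    "-": {"+": 0.05, "-": 0.95},
}
EMISSIONS = {
    "+": {"A": 0.15, "C": 0.33, "T": 0.16, "G": 0.36},
    "-": {"A": 0.27, "C": 0.24, "T": 0.26, "G": 0.23},
}
INITIAL = {"+": 0.5, "-": 0.5}


def is_distribution(probabilities, tol=1e-9):
    """Return True if the values are nonnegative and sum to 1."""
    values = list(probabilities.values())
    return all(v >= 0 for v in values) and abs(sum(values) - 1.0) < tol


def sequence_probability(sequence):
    """Forward algorithm: probability of the sequence over all hidden paths."""
    if not sequence:
        return 1.0
    # Initialisation: start in state s and emit the first symbol.
    forward = {s: INITIAL[s] * EMISSIONS[s][sequence[0]] for s in STATES}
    # Recursion: move to state t, then emit the next symbol from t.
    for symbol in sequence[1:]:
        forward = {
            t: EMISSIONS[t][symbol]
            * sum(forward[s] * TRANSITIONS[s][t] for s in STATES)
            for t in STATES
        }
    return sum(forward.values())


def total_probability(length):
    """Sum the probabilities of all sequences of the given length."""
    return sum(
        sequence_probability("".join(seq))
        for seq in product(SYMBOLS, repeat=length)
    )
```

For a single symbol the computation reduces to a weighted average of the two emission rows; for example, `sequence_probability("C")` is 0.5 · 0.33 + 0.5 · 0.24. Summing over all sequences of a fixed length with `total_probability` should give 1, which is a quick sanity check on the table.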
If we want the model to incorporate information about dinucleotides, as in the case of Table 9.1, the set of emitted symbols is again M = {A, C, T, G}, but now the emission events at each step are not independent from one another. If, say, the process is in the hidden state “+,” the probability for emitting a symbol C will depend upon the symbol emitted by the previous state and whether this symbol was emitted from the “+” or from the “−” hidden state. We can think of it as emitted from one of two hidden states C+ or C−. Thus, for each of the emission symbols k ∈ M we should have states k+ and k− in Q, leading to a state space Q = {A+, A−, C+, C−, T+, T−, G+, G−} for the hidden process. The matrix for the transitions within the subsets of the “+” and “−” states should be close to those in the transition matrices in Table 9.5 but switching between the “+” and “−” subsets Q+ = {A+, C+, T+, G+} and Q− = {A−, C−, T−, G−} of Q should also be allowed with some small probability. Table 9.5 presents this scenario.
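The construction of the eight-state space can be sketched in code. The example below is only an illustration under stated assumptions: the values of Table 9.5 are not reproduced here, so the within-subset matrices are uniform placeholders, and the rule used for switching (stay in the current subset with probability 1 minus a small switching probability, otherwise move to the other subset and pick the next nucleotide from that subset's matrix) is one possible choice, not necessarily the one behind Table 9.5. The names `build_transitions`, `uniform_matrix`, and `emitted_symbol` are hypothetical.

```python
NUCLEOTIDES = ("A", "C", "T", "G")
SIGNS = ("+", "-")  # "-" stands for the "−" subset

# State space Q = {A+, A-, C+, C-, T+, T-, G+, G-}.
STATES = [n + s for n in NUCLEOTIDES for s in SIGNS]


def emitted_symbol(state):
    """Each state k+ or k- emits its own nucleotide k."""
    return state[0]


def uniform_matrix():
    """Placeholder 4x4 within-subset matrix (NOT the values of Table 9.5)."""
    return {x: {y: 0.25 for y in NUCLEOTIDES} for x in NUCLEOTIDES}


def build_transitions(within, switch):
    """Combine two within-subset matrices into one 8x8 transition table.

    within[s][x][y]: probability of nucleotide y after x inside subset s.
    switch[s]: small probability of leaving subset s at any step (assumed).
    """
    table = {}
    for x in NUCLEOTIDES:
        for s in SIGNS:
            row = {}
            for y in NUCLEOTIDES:
                for t in SIGNS:
                    # Stay in subset s, or switch to the other subset t.
                    weight = 1.0 - switch[s] if t == s else switch[s]
                    row[y + t] = weight * within[t][x][y]
            table[x + s] = row
    return table


# Switching probabilities borrowed from the "Transitions" part of Table 9.4.
EXAMPLE = build_transitions(
    within={"+": uniform_matrix(), "-": uniform_matrix()},
    switch={"+": 0.10, "-": 0.05},
)
```

Each row of the resulting table sums to 1 by construction, because the two weights 1 − switch[s] and switch[s] multiply rows that are themselves probability distributions. Replacing the placeholder matrices with estimated dinucleotide frequencies would give a model of the kind the text describes.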
Exercise 9.5. The HMM from Table 9.4 could be considered to be a special case of the general model from Table 9.5 with state space Q = {A+, A−, C+, C−, T+, T−, G+, G−}. Give a set of HMM parameters for the general HMM from