where the sum is over all possible hidden sequences π = π_1 π_2 π_3 ··· π_l with π_t ∈ Q, and where

    P(x, π) = a_{0π_1} ∏_{i=1}^{l} e_{π_i}(x_i) a_{π_i π_{i+1}}.        (9.2)
As with the decoding problem, the mathematical solution from Eq. (9.1) has no practical value, since the number of such paths grows exponentially as the length l of the path grows.
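To see Eq. (9.2) and the exponential blow-up concretely, here is a minimal sketch that enumerates all |Q|^l hidden paths for a toy two-state HMM; the fair/loaded-coin states and all parameter values are hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical toy HMM (fair "F" vs. loaded "L" coin); all numbers are made up.
Q = ["F", "L"]                                   # hidden states
a0 = {"F": 0.5, "L": 0.5}                        # initial probabilities a_{0,pi_1}
a = {("F", "F"): 0.9, ("F", "L"): 0.1,           # transition probabilities a_{jk}
     ("L", "F"): 0.1, ("L", "L"): 0.9}
e = {("F", "H"): 0.5, ("F", "T"): 0.5,           # emission probabilities e_k(x)
     ("L", "H"): 0.75, ("L", "T"): 0.25}

def joint_prob(x, pi):
    """P(x, pi) as in Eq. (9.2): a_{0,pi_1} * prod_i e_{pi_i}(x_i) * a_{pi_i,pi_{i+1}}.
    The final transition out of pi_l (into an end state) is taken to be 1 here."""
    p = a0[pi[0]]
    for i in range(len(x)):
        p *= e[(pi[i], x[i])]
        if i + 1 < len(x):
            p *= a[(pi[i], pi[i + 1])]
    return p

x = "HHTH"
paths = list(product(Q, repeat=len(x)))          # |Q|^l paths: 2^4 = 16 here
total = sum(joint_prob(x, pi) for pi in paths)   # P(x), summed over all hidden paths
print(len(paths), total)
```

Already at l = 4 there are 16 paths; at l = 100 there would be 2^100, which is why the brute-force sum is unusable in practice.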
3. Training (Learning) Problem: Given an observed sequence x, or a set of observed sequences, what are the HMM parameters that make the sequence x most likely to occur? The answer to this question provides estimates for the parameters of the HMM from a data set of observed sequences.
9.4.1 Decoding: The Viterbi algorithm

The Viterbi algorithm is a recursive and computationally efficient method for computing the most likely state sequence π_max for a given observed sequence x = x_1 x_2 x_3 ··· x_l (see [28], and also [29] for an interesting account of the history by Andrew Viterbi himself).
To understand the recursive step, let's assume that for any state j ∈ Q we have somehow computed the hidden sequence π_1 π_2 π_3 ··· π_{t−2} π_{t−1} of highest probability among those emitting the observations x_1 x_2 x_3 ··· x_{t−1} up to time t − 1, with π_{t−1} = j.
That is, we assume that for each j ∈ Q, we have determined the most probable path of length t − 1 for the data x_1 x_2 x_3 ··· x_{t−1}, ending in state j. Denote the probability of this path by v_j(t − 1):

    v_j(t − 1) = max_{π = π_1 π_2 π_3 ··· π_{t−1}} P(π_{t−1} = j, x_1 x_2 ··· x_{t−1}),    j ∈ Q.
Next, we can use v_j(t − 1) to compute the most probable path of length t ending in each of the states k ∈ Q and emitting the sequence x_1 x_2 x_3 ··· x_t:

    v_k(t) = max_{π = π_1 π_2 π_3 ··· π_t} P(π_t = k, x_1 x_2 ··· x_t).
The probability of the most likely path of length t ending in k and emitting x_t will be the largest among the probabilities for hidden sequences that get into some state j at time t − 1 with maximal probability, transition from j to k at time t, and emit x_t. Thus

    v_k(t) = max_{j ∈ Q} { v_j(t − 1) a_{jk} e_k(x_t) } = e_k(x_t) max_{j ∈ Q} { v_j(t − 1) a_{jk} }.
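A single step of this recursion can be sketched as follows; the two-state HMM and the values of v_j(t − 1) here are hypothetical, for illustration only.

```python
# A minimal sketch of one Viterbi recursion step; the toy two-state HMM
# parameters below are hypothetical.
Q = ["F", "L"]
a = {("F", "F"): 0.9, ("F", "L"): 0.1,
     ("L", "F"): 0.1, ("L", "L"): 0.9}
e = {("F", "H"): 0.5, ("L", "H"): 0.75}

def viterbi_step(v_prev, x_t):
    """Compute v_k(t) = e_k(x_t) * max_{j in Q} v_j(t-1) * a_{jk} for every k,
    remembering the argmax state j as the backpointer ptr(k, t)."""
    v, ptr = {}, {}
    for k in Q:
        best_j = max(Q, key=lambda j: v_prev[j] * a[(j, k)])
        v[k] = e[(k, x_t)] * v_prev[best_j] * a[(best_j, k)]
        ptr[k] = best_j
    return v, ptr

v_prev = {"F": 0.25, "L": 0.375}   # v_j(t-1) from some earlier step (made up)
v, ptr = viterbi_step(v_prev, "H")
print(v, ptr)
```

Each new value v_k(t) costs one maximization over |Q| states, so a full run is O(l·|Q|²) rather than exponential in l.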
Iterating this argument for all t = 1, 2, ..., l will allow us to compute the probability of the most likely path π = π_1 π_2 π_3 ··· π_l for the data x = x_1 x_2 x_3 ··· x_l. To be able to recover the path π = π_1 π_2 π_3 ··· π_l itself, for each time t and for each state k ∈ Q, we keep pointers (ptr) to remember the state r ∈ Q from which the maximal-probability path ending in k came. For each t = 1, 2, ..., l and k ∈ Q we record k's predecessor
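Putting the initialization, the recursion, and the pointer bookkeeping together, the whole algorithm can be sketched as follows; the fair/loaded-coin HMM and its parameters are again hypothetical.

```python
# A full sketch of the Viterbi algorithm with backpointers (ptr), under a
# hypothetical fair/loaded-coin HMM; all parameter values are illustrative.
Q = ["F", "L"]
a0 = {"F": 0.5, "L": 0.5}
a = {("F", "F"): 0.9, ("F", "L"): 0.1,
     ("L", "F"): 0.1, ("L", "L"): 0.9}
e = {("F", "H"): 0.5, ("F", "T"): 0.5,
     ("L", "H"): 0.75, ("L", "T"): 0.25}

def viterbi(x):
    """Return (max probability, most likely state path) for observations x."""
    # Initialization: v_k(1) = a_{0k} e_k(x_1)
    v = {k: a0[k] * e[(k, x[0])] for k in Q}
    ptr = []                                   # ptr[t][k] = predecessor of k at time t+1
    # Recursion: v_k(t) = e_k(x_t) max_j v_j(t-1) a_{jk}
    for t in range(1, len(x)):
        v_new, back = {}, {}
        for k in Q:
            best_j = max(Q, key=lambda j: v[j] * a[(j, k)])
            v_new[k] = e[(k, x[t])] * v[best_j] * a[(best_j, k)]
            back[k] = best_j
        ptr.append(back)
        v = v_new
    # Termination: best final state, then trace the pointers backwards
    last = max(Q, key=lambda k: v[k])
    path = [last]
    for back in reversed(ptr):
        path.append(back[path[-1]])
    return v[last], "".join(reversed(path))

prob, path = viterbi("HHHH")
print(prob, path)
```

On the all-heads sequence this toy model decodes, as one would expect, a run of the loaded state.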