where the sum is over all possible hidden sequences π = π_1 π_2 π_3 ··· π_l with π_t ∈ Q and where

    P(x, π) = a_{0π_1} ∏_{i=1}^{l} e_{π_i}(x_i) a_{π_i π_{i+1}}.    (9.2)
As with the decoding problem, the mathematical solution from Eq. (9.1) has no practical value, since the number of such paths grows exponentially as the length of the path l grows.
3. Training (Learning) Problem: Given an observed sequence x or a set of observed
sequences, what are the HMM parameters that make the sequence x most likely
to occur? The answer to this question provides estimates for the parameters of the
HMM from a data set of observed sequences.
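For a single fixed path π, Eq. (9.2) can be evaluated directly by multiplying one emission and one transition probability per position. A minimal sketch in Python, using a hypothetical two-state HMM (the states H and L and all probabilities below are illustrative, not from the text; the transition out of the final state is omitted for simplicity):

```python
def path_probability(x, pi, a, e):
    """P(x, pi) per Eq. (9.2): a_{0 pi_1} * prod_i e_{pi_i}(x_i) * a_{pi_i pi_{i+1}}."""
    p = a[("start", pi[0])]              # a_{0 pi_1}: leave the silent start state
    for i, (state, symbol) in enumerate(zip(pi, x)):
        p *= e[(state, symbol)]          # e_{pi_i}(x_i): emit the i-th observation
        if i + 1 < len(pi):              # a_{pi_i pi_{i+1}}: transition to the next state
            p *= a[(state, pi[i + 1])]
    return p

# Hypothetical two-state HMM over the DNA alphabet (numbers are made up):
a = {("start", "H"): 0.5, ("start", "L"): 0.5,
     ("H", "H"): 0.6, ("H", "L"): 0.4,
     ("L", "H"): 0.3, ("L", "L"): 0.7}
e = {("H", "A"): 0.2, ("H", "C"): 0.3, ("H", "G"): 0.3, ("H", "T"): 0.2,
     ("L", "A"): 0.3, ("L", "C"): 0.2, ("L", "G"): 0.2, ("L", "T"): 0.3}

print(path_probability("GC", "HH", a, e))  # 0.5 * 0.3 * 0.6 * 0.3
```

Computing this for one path is cheap; the difficulty noted above is that summing (or maximizing) over all |Q|^l possible paths directly is infeasible.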
9.4.1 Decoding: The Viterbi algorithm
The Viterbi algorithm is a recursive and computationally efficient method for computing the most likely state sequence π_max for a given observed sequence x = x_1 x_2 x_3 ··· x_l (see [28] and also [29] for an interesting account of the history by Andrew Viterbi himself).
To understand the recursive step, let's assume that for any state j ∈ Q we have somehow computed the hidden sequence π_1 π_2 π_3 ··· π_{t−2} π_{t−1} of highest probability among those emitting the observations x_1 x_2 x_3 ··· x_{t−1} up to time t − 1 with π_{t−1} = j. That is, we assume that for each j ∈ Q, we have determined the most probable path of length t − 1 for the data x_1 x_2 x_3 ··· x_{t−1}, ending in state j. Denote the probability of this path by v_j(t − 1):
    v_j(t − 1) = max_{π = π_1 π_2 π_3 ··· π_{t−1}} P(π, π_{t−1} = j, x_1 x_2 x_3 ··· x_{t−1}),    j ∈ Q.
Next, we can use v_j(t − 1) to compute the most probable path of length t ending in each of the states k ∈ Q and emitting the sequence x_1 x_2 x_3 ··· x_t:

    v_k(t) = max_{π = π_1 π_2 π_3 ··· π_t} P(π, π_t = k, x_1 x_2 x_3 ··· x_t).
The probability v_k(t) of the most likely path of length t ending in k and emitting x_1 x_2 x_3 ··· x_t will be the largest among the probabilities for hidden sequences that get into j at time t − 1 with a maximal probability, transition from j to k at time t, and emit x_t. Thus

    v_k(t) = max_{j ∈ Q} { v_j(t − 1) a_{jk} e_k(x_t) } = e_k(x_t) max_{j ∈ Q} { v_j(t − 1) a_{jk} }.
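A minimal sketch of this recursion in Python, with backpointers kept at each step so the best path can be recovered afterwards (the two-state HMM below is hypothetical, not from the text):

```python
def viterbi(x, states, a0, a, e):
    """Most probable hidden path for x via v_k(t) = e_k(x_t) * max_j { v_j(t-1) * a_jk }."""
    v = {k: a0[k] * e[(k, x[0])] for k in states}      # initialization: v_k(1)
    ptrs = []                                          # ptrs[t][k] = best predecessor of k
    for t in range(1, len(x)):
        v_new, ptr = {}, {}
        for k in states:
            # argmax_j { v_j(t-1) * a_jk }, remembering which j achieved the max
            best_j = max(states, key=lambda j: v[j] * a[(j, k)])
            v_new[k] = e[(k, x[t])] * v[best_j] * a[(best_j, k)]
            ptr[k] = best_j
        v = v_new
        ptrs.append(ptr)
    # Trace the pointers back from the most probable final state.
    last = max(states, key=lambda k: v[k])
    path = [last]
    for ptr in reversed(ptrs):
        path.append(ptr[path[-1]])
    return v[last], "".join(reversed(path))

# Hypothetical two-state HMM (all numbers are made up):
a0 = {"H": 0.5, "L": 0.5}
a = {("H", "H"): 0.6, ("H", "L"): 0.4, ("L", "H"): 0.3, ("L", "L"): 0.7}
e = {("H", "A"): 0.1, ("H", "C"): 0.4, ("H", "G"): 0.4, ("H", "T"): 0.1,
     ("L", "A"): 0.4, ("L", "C"): 0.1, ("L", "G"): 0.1, ("L", "T"): 0.4}

prob, path = viterbi("GGA", ["H", "L"], a0, a, e)
print(path, prob)
```

Because each step keeps only the best score per state, the total work is on the order of l · |Q|² operations rather than exponential in l.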
Iterating this argument for all t = 1, 2, ..., l will allow us to compute the probability of the most likely path π = π_1 π_2 π_3 ··· π_l for the data x = x_1 x_2 x_3 ··· x_l. To be able to recover the path π = π_1 π_2 π_3 ··· π_l itself, for each time t and for each state k ∈ Q, we keep pointers (ptr) to remember the state r ∈ Q from which the maximal-probability path ending in k came. For each t = 1, 2, ..., l and k ∈ Q we record k's predecessor
 