Although continuous and semicontinuous HMMs have been developed, discrete output HMMs are often preferred in practice because of their relative computational simplicity and reduced sensitivity to initial parameter settings during training [39]. A discrete hidden Markov chain, depicted in Figure 5.5, consists of a set of n states interconnected through probabilistic transitions, and it is completely defined by the triplet λ = [a, B, π], where a is the probabilistic n × n state transition matrix, B is the L × n output probability matrix (with L discrete output symbols), and π is the n-length initial state probability distribution vector [39, 40].
Compared with the Kalman filter, the HMM has discrete (rather than continuous) hidden state variables. Additionally, the HMM evolves through stochastic transitions among its discrete states, whereas the Kalman filter follows a single-step Markov model with a known deterministic rule, the randomness entering through the additive process noise. The graphical model representations of the two topologies in Figure 5.5 make their structural similarity even more apparent.
In contrast to the Gaussian noise model used in the Kalman filter, the HMM is similar to the particle filter in that it can represent an arbitrary distribution for the next value of the state variables. The differences between the two models become more apparent in training: the Kalman filter updates are performed sample by sample, whereas HMMs are normally trained with batch updates. For an observation sequence O, we locally maximize P(O | λ) (i.e., the probability of the observation sequence O given the model λ) by finding the maximum-likelihood estimate with an expectation-maximization (EM) algorithm known as the Baum-Welch algorithm [40].
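To make the quantity being maximized concrete, below is a minimal sketch of the forward recursion that evaluates P(O | λ) for a discrete HMM, assuming NumPy and the matrix conventions above (A is the n × n transition matrix, B is L × n, π has length n). The function name is ours, and the rescaling normally used to prevent numerical underflow on long sequences is omitted for brevity; Baum-Welch reuses these forward variables (together with a backward pass) in its E-step.

```python
import numpy as np

def forward_likelihood(A, B, pi, obs):
    """Evaluate P(O | lambda) for a discrete HMM via the forward algorithm.

    A   : (n, n) transition matrix, A[i, j] = P(s_t = j | s_{t-1} = i)
    B   : (L, n) output matrix,     B[k, j] = P(o_t = k | s_t = j)
    pi  : (n,) initial state distribution
    obs : sequence of integer symbol indices in [0, L)
    """
    alpha = pi * B[obs[0]]          # alpha_1(j) = pi_j * b_j(o_1)
    for o in obs[1:]:
        alpha = (alpha @ A) * B[o]  # alpha_t(j) = [sum_i alpha_{t-1}(i) a_ij] * b_j(o_t)
    return alpha.sum()              # P(O | lambda) = sum_j alpha_T(j)
```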
Recall the definition of the maximum-likelihood estimation problem. We have a density function p(O | λ) governed by the set of parameters λ, and a data set of size N assumed to be drawn from this distribution, O = [O_1, ..., O_N]; that is, we assume these data vectors are i.i.d. with distribution p. The resulting density for the samples is
$$p(O \,|\, \lambda) = \prod_{i=1}^{N} p(O_i \,|\, \lambda) = L(\lambda \,|\, O) \qquad (5.32)$$
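In practice, one works with the log-likelihood, which turns the product in (5.32) into a sum and avoids numerical underflow:

$$\log L(\lambda \,|\, O) = \sum_{i=1}^{N} \log p(O_i \,|\, \lambda).$$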
FIGURE 5.5: HMM (left) and Kalman filter (right). Squares denote discrete quantities, circles continuous quantities. White denotes unobservable quantities, and gray observable quantities. Arrows indicate dependencies.