Although continuous and semicontinuous HMMs have been developed, discrete output HMMs are often preferred in practice because of their relative computational simplicity and reduced sensitivity to initial parameter settings during training [39]. A discrete hidden Markov chain, depicted in Figure 5.5, consists of a set of n states interconnected through probabilistic transitions, and it is completely defined by the triplet λ = [a, B, π], where a is the probabilistic n × n state transition matrix, B is the L × n output probability matrix (with L discrete output symbols), and π is the n-length initial state probability distribution vector [39, 40].
Compared with the Kalman filter, the HMM has discrete (rather than continuous) hidden state variables. Additionally, the HMM evolves through stochastic transitions among its discrete states, whereas the Kalman filter follows a single-step Markov model with a known deterministic rule, the randomness entering through the additive process noise. The graphical model representations of the two topologies in Figure 5.5 make their structural similarity even more apparent.
In contrast to the Gaussian noise model used in the Kalman filter, the HMM is similar to the particle filter in that it can represent an arbitrary distribution for the next value of the state variables. The differences between the two models become more apparent in training: the Kalman filter updates are performed sample by sample, whereas HMMs are normally trained with batch updates. For an observation sequence O, we locally maximize P(O | λ) (i.e., the probability of the observation sequence O given the model λ) by finding the maximum-likelihood estimate with an expectation-maximization (EM) algorithm known as the Baum-Welch algorithm [40].
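To make the quantity being maximized concrete, below is a minimal sketch of the forward recursion that evaluates P(O | λ) for a discrete HMM, assuming NumPy and the matrix conventions above (A is the n × n transition matrix, B is L × n, π has length n). The function name is ours, and the rescaling normally used to prevent numerical underflow on long sequences is omitted for brevity; Baum-Welch reuses these forward variables (together with a backward pass) in its E-step.

```python
import numpy as np

def forward_likelihood(A, B, pi, obs):
    """Evaluate P(O | lambda) for a discrete HMM via the forward algorithm.

    A   : (n, n) transition matrix, A[i, j] = P(s_t = j | s_{t-1} = i)
    B   : (L, n) output matrix,     B[k, j] = P(o_t = k | s_t = j)
    pi  : (n,) initial state distribution
    obs : sequence of integer symbol indices in [0, L)
    """
    alpha = pi * B[obs[0]]          # alpha_1(j) = pi_j * b_j(o_1)
    for o in obs[1:]:
        alpha = (alpha @ A) * B[o]  # alpha_t(j) = [sum_i alpha_{t-1}(i) a_ij] * b_j(o_t)
    return alpha.sum()              # P(O | lambda) = sum_j alpha_T(j)
```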
Recall the definition of the maximum-likelihood estimation problem. We have a density function p(O | λ) governed by the set of parameters λ, and a data set of size N assumed to be drawn from this distribution, O = [O_1, ..., O_N]; that is, we assume these data vectors are i.i.d. with distribution p. The resulting density for the samples is
$$p(O \,|\, \lambda) = \prod_{i=1}^{N} p(O_i \,|\, \lambda) = L(\lambda \,|\, O) \qquad (5.32)$$
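In practice, one works with the log-likelihood, which turns the product in (5.32) into a sum and avoids numerical underflow:

$$\log L(\lambda \,|\, O) = \sum_{i=1}^{N} \log p(O_i \,|\, \lambda).$$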
FIGURE 5.5: HMM (left) and Kalman filter (right). Squares denote discrete quantities, circles continuous quantities. White denotes unobservable quantities, and gray observable quantities. Arrows indicate dependencies.