$$h_k(n) = E[I_k(n)] = P(k \mid x(n), \chi(n-1)) = \frac{P(k \mid \chi(n-1))\, p(x(n) \mid \chi(n-1), k)}{p(x(n) \mid \chi(n-1))} = \frac{g_k(\chi(n-1))\, p(x(n) \mid \chi(n-1), k)}{\sum_{k=1}^{K} g_k(\chi(n-1))\, p(x(n) \mid \chi(n-1), k)} \qquad (3.39)$$
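As a concrete illustration, the E step of (3.39) reduces to a few lines of array code once the expert likelihoods are taken to be Gaussian with variances $\sigma_k^2$, consistent with the form of the cost in (3.40) below. The following NumPy sketch is illustrative only; the function and array names are assumptions, not part of the original text.

```python
import numpy as np

def e_step(gate, err, sigma):
    """Posterior h_k(n) of (3.39), assuming Gaussian experts.

    gate  : (N, K) gate outputs g_k(chi(n-1)); each row sums to one
    err   : (N, K) prediction errors x(n) - x_tilde_k(n)
    sigma : (K,)   expert noise standard deviations
    """
    # Gaussian expert likelihoods p(x(n) | chi(n-1), k)
    lik = np.exp(-err**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)
    joint = gate * lik                                # numerator of (3.39)
    return joint / joint.sum(axis=1, keepdims=True)   # normalize over experts
```

Each EM iteration would evaluate these posteriors with the current gate outputs and expert predictions before re-fitting the parameters in the M step.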
Thus, $h_k(n)$ is the posterior probability of expert $k$, given both the current value of the time series and the recent past. For the M step, $L$ is maximized or, equivalently, the negative log-likelihood,
$$J = \sum_{n=1}^{N} \sum_{k=1}^{K} \left\{ -h_k(n)\,\log[g_k(\chi(n-1))] + h_k(n)\left[\frac{(x(n)-\tilde{x}_k(n))^2}{2\sigma_k^2} + \log\sigma_k\right] \right\} \qquad (3.40)$$
is globally minimized over the free parameters. The process is then repeated. If, in the M step, J is
only decreased and not minimized, then the process is called the generalized EM algorithm. This
is necessary when either the experts or the gate is nonlinear and a search for the global minimum is impractical.
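Continuing the sketch above, the cost of (3.40) can be evaluated directly from the posteriors, the gate outputs, and the expert prediction errors. This is a minimal sketch under the same Gaussian-expert assumptions, not the book's implementation:

```python
import numpy as np

def m_step_cost(h, gate, err, sigma, eps=1e-12):
    """Cost J of (3.40), minimized (or, for generalized EM, merely
    decreased) over the gate and expert parameters while the
    posteriors h are held fixed."""
    cross_entropy = -np.sum(h * np.log(gate + eps))                  # first term
    fit = np.sum(h * (err**2 / (2.0 * sigma**2) + np.log(sigma)))    # second term
    return cross_entropy + fit
```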
The first term in the summation of ( 3.40 ) can be regarded as the cross-entropy between
the posterior probabilities and the gate. It has a minimum when only one expert is valid and thus
encourages the experts to divide up the input space. To ensure that the outputs of the gate sum to
unity, the output layer of the MLP has a “softmax” transfer function,
$$g_k(\chi) = \frac{\exp[s_k(\chi)]}{\sum_{j=1}^{K}\exp[s_j(\chi)]} \qquad (3.41)$$
where $s_k$ is the $k$th input to the softmax. For a gate implemented as an MLP, the cross-entropy term in (3.40) cannot be minimized in a single step, and the generalized EM algorithm must be employed. If the gate is trained through gradient descent (backpropagation), the error backpropagated to the input side of the softmax at each time step is
$$\frac{\partial J}{\partial s_k} = g_k(\chi) - h_k \qquad (3.42)$$
This is the same backpropagated error that would result for an MSE criterion with the posterior probabilities acting as the desired signal. Thus, the posterior probabilities act as targets for the gate. For
each EM iteration, several training iterations may be required for the gate because it is implemented
using a multilayer perceptron.
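A minimal sketch of the gate side follows, assuming for brevity a single linear layer feeding the softmax rather than the full MLP of the text; it implements (3.41) and the backpropagated error of (3.42):

```python
import numpy as np

def softmax(s):
    """Softmax of (3.41), applied row-wise to the gate activations s."""
    z = np.exp(s - s.max(axis=1, keepdims=True))   # shift for numerical stability
    return z / z.sum(axis=1, keepdims=True)

def softmax_input_error(s, h):
    """Backpropagated error of (3.42): dJ/ds_k = g_k - h_k.

    With the posteriors h as targets, this matches the error an MSE
    criterion would produce, so the gate is trained by several
    gradient-descent (backpropagation) steps per EM iteration."""
    return softmax(s) - h
```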
There is an analytical solution for the experts, at each iteration, when they are linear predictors, $\tilde{x}_k(n) = w_k^T \chi(n-1)$ and $w_k = R_k^{-1} p_k$, where $R_k$ and $p_k$ are the weighted autocorrelation and cross-correlation matrices, respectively,
 