$$\sum_{k=1}^{K} g_k(n) = 1 \qquad (3.48)$$
Such a linear mixture can represent either a competitive or cooperative system, depending on
how the experts are penalized for errors, as determined by the cost function. In fact, it was in the
context of introducing their mixture of experts model that Jacobs et al. [53] first presented a cost
function that encourages competition among gated expert networks, which we generalize to
$$J(n) = \sum_{k=1}^{N} g_k(n)\, f\big(d(n) - y_k(\chi(n))\big) \qquad (3.49)$$
where d is the desired signal and f(d(n) − y_k(χ(n))) is a function of the error between the desired
signal and the output of the kth expert. Because the desired signal is the same for all experts, they all try to regress the
same data and are always in competition. This alone, however, is not enough to foster specialization.
The gate uses information from the performance of the experts to produce the mixing coefficients.
There are many variations of algorithms that fall within this framework. Let us discuss the important
components one at a time, starting with the design of the desired signal.
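As a concrete illustration of (3.48) and (3.49), the following Python sketch evaluates the competitive cost at a single time step; the function name and the choice of a squared-error penalty for f are illustrative assumptions, not part of the original formulation:

```python
import numpy as np

def competitive_cost(d_n, expert_outputs, gate, f=lambda e: e ** 2):
    """Evaluate the cost (3.49) at one time step.

    d_n            : desired signal d(n)
    expert_outputs : outputs y_k(x(n)) of the experts
    gate           : mixing coefficients g_k(n), summing to 1 as in (3.48)
    f              : penalty applied to each expert's error (squared error here)
    """
    errors = d_n - expert_outputs      # e_k(n) = d(n) - y_k(x(n))
    return np.dot(gate, f(errors))     # J(n) = sum_k g_k(n) f(e_k(n))

# Example: three experts, with the gate favoring the (accurate) second expert
d_n = 1.0
y = np.array([0.8, 1.05, 0.2])
g = np.array([0.2, 0.7, 0.1])          # satisfies (3.48)
print(competitive_cost(d_n, y, g))
```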
The formalism represented by (3.49) is a supervised algorithm, in that it requires a desired
signal. However, we are interested in a completely unsupervised algorithm. A supervised algorithm
becomes unsupervised when the desired signal is a fixed transformation T of the input, d(n) = T(χ(n)).
Although many transformations are possible, the two most common choices involve the delay
operator and the identity matrix, resulting in prediction as explained above and auto-association,
which yields a generative model for PCA [54].
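To make the two choices concrete, a minimal sketch follows; the test signal and variable names are ours, used only to show how the delay operator and the identity produce prediction and auto-association targets:

```python
import numpy as np

x = np.sin(0.1 * np.arange(100))   # input signal x(n)

# Prediction: the delay operator makes the desired signal the next sample,
# so each expert sees x(n) and must produce d(n) = x(n + 1).
x_pred, d_pred = x[:-1], x[1:]

# Auto-association: the identity makes the desired signal the input itself,
# d(n) = x(n), which yields a generative model for PCA.
x_auto, d_auto = x, x
```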
Gates can be classified into two broad categories, which we designate as input-based or output-based.
With input-based gating, the gate is an adaptable function of the input, g_k = g_k(χ(n)), that
learns to forecast which expert will perform best, as we have seen in the mixture of experts.
For output-based gating, the gate is a directly calculated function of the performance, and hence,
the outputs, of the experts. The gate in the annealed competition of experts of Pawelzik et al. [55]
implements memory in the form of a local boxcar average of the experts' squared errors. The self-
annealing competitive prediction of Fancourt and Principe also uses the local squared error, but with
a recursive estimator. The mixture of experts can also keep track of past expert performance, the
simplest example of which is the mixture model where the gate is expanded with memory to create
an estimate of the average of the posterior probabilities over the data set [51]. Perhaps the simplest
scheme is hard competition, for which the gate chooses the expert with the smallest magnitude
error in a winner-take-all fashion, as will be explained below. This method simplifies the architecture
and is very appropriate for system identification in a control framework because it simplifies the
design of controllers.
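A minimal sketch of output-based gating follows, assuming a boxcar window of each expert's recent squared errors; the exponential mapping from errors to mixing coefficients and the function names are illustrative choices, not the exact update rules of the cited algorithms:

```python
import numpy as np

def soft_gate(sq_err_window, beta=1.0):
    """Soft output-based gate: boxcar-average each expert's squared error and
    map lower error to a larger coefficient; the coefficients sum to 1."""
    local_err = sq_err_window.mean(axis=0)   # boxcar average per expert
    w = np.exp(-beta * local_err)
    return w / w.sum()

def hard_gate(sq_err_window):
    """Hard competition: a winner-take-all gate gives all the weight to the
    expert with the smallest local squared error."""
    local_err = sq_err_window.mean(axis=0)
    g = np.zeros_like(local_err)
    g[np.argmin(local_err)] = 1.0
    return g

# Example: a 10-sample window of squared errors for three experts
window = np.random.rand(10, 3) * np.array([1.0, 0.1, 0.5])
print(soft_gate(window))   # graded coefficients summing to 1
print(hard_gate(window))   # all weight on the best-performing expert
```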
 