where $w^*_X(x)$ is defined as $1/f^*_X(x)$, $N$ is the number of neurons in the MLP input layer, and the parameters $A, B \in \mathbb{R}^+$ are algorithm optimization values which depend on the specific application of the AMLP algorithm. Values for $A$ and $B$ have been determined empirically. Eq. (6) is a Gaussian distribution, so it has been assumed that the pdf of $X$ is Gaussian (if this is not the case, the real pdf of $X$ should be used instead). Then $w^*_X(x)$ takes high values for infrequent values of $x$ and values close to 1 for frequent ones, and can therefore be applied straightforwardly in the weight-updating procedure to model biological metaplasticity during learning.
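As a minimal sketch of this weighting, the Python snippet below evaluates $w^*_X(x) = 1/f^*_X(x)$ for a frequent and an infrequent pattern. Eq. (6) is not reproduced in this section, so the Gaussian-shaped expression used in `f_star`, the function names, and the values of `A` and `B` are assumptions chosen only for illustration.

```python
import math

def f_star(x, A, B):
    """Assumed Gaussian-shaped stand-in for Eq. (6):
    A / sqrt((2*pi)^N * exp(B * sum(x_i^2))), with N = number of inputs."""
    n = len(x)
    squared_norm = sum(xi * xi for xi in x)
    return A / math.sqrt((2.0 * math.pi) ** n * math.exp(B * squared_norm))

def w_star(x, A, B):
    """Weighting function w*(x) = 1 / f*(x)."""
    return 1.0 / f_star(x, A, B)

# Illustrative values only: with N = 2 inputs, A = 2*pi gives w*(0) = 1.
A, B = 2.0 * math.pi, 0.5
frequent = [0.0, 0.0]    # pattern at the mode of the assumed pdf
infrequent = [3.0, 3.0]  # pattern far from the mode

w_frequent = w_star(frequent, A, B)
w_infrequent = w_star(infrequent, A, B)
print(w_frequent, w_infrequent)
```

Under these assumed values, the pattern at the mode receives a weight of 1, while the pattern far from the mode receives a much larger weight, which is the qualitative behaviour described above.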
5 AMP in MLP Training: AMMLP
In the case of an MLP trained with BPA applied to $L$ classes, $H_l$, $l = 0, 1, \ldots, L-1$, previous studies have shown that the output for each class is the MLP's inherent estimation of the a posteriori probability of the class [16]. Based on Bayes' theorem, we then have:
$$y_l = P(H_l \mid x) = \frac{f_X(x \mid H_l)\, P(H_l)}{f_X(x)} \qquad (7)$$
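To make Eq. (7) concrete, here is a small Python sketch that computes the a posteriori probabilities from class-conditional densities and priors. The function name and the numeric values are hypothetical; the only fact used beyond Eq. (7) is that $f_X(x)$ equals the sum of $f_X(x \mid H_l)\,P(H_l)$ over all classes.

```python
def posteriors(likelihoods, priors):
    """Return P(H_l | x) for every class l, given f_X(x | H_l) and P(H_l)."""
    joint = [f * p for f, p in zip(likelihoods, priors)]
    evidence = sum(joint)  # f_X(x), by the law of total probability
    return [j / evidence for j in joint]

# Hypothetical two-class example evaluated at a single input x.
likelihoods = [0.30, 0.10]  # f_X(x | H_0), f_X(x | H_1)
priors = [0.25, 0.75]       # P(H_0), P(H_1)

post = posteriors(likelihoods, priors)
print(post)
```

In this example the rarer class has the larger likelihood, and the two effects cancel so that both posteriors are one half. A trained MLP output $y_l$ is taken in the text as an estimate of these values.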
This enables a direct implementation of metaplasticity. For each class, by assuming the proposed AMP model described in subsection 4.2, we can make $f^*_X(x) = f_X(x)$, and from Eq. (7) and Eq. (4):
$$e(x \mid H_l) = E(x)\, f_X(x \mid H_l)$$

$$E_{M_l} = \frac{1}{M_l}\sum_{k=1}^{M_l} E(x_k)\,\frac{f_X(x \mid H_l)}{f^*_X(x_k)} = \frac{1}{M_l}\sum_{k=1}^{M_l} E(x_k)\,\frac{y_l}{P(H_l)} \qquad (8)$$
where $k = 1, 2, \ldots, M_l$ indexes the independent sample vectors of class $l$ in the training set. Then, from Eq. (8) and Eq. (4):
$$\frac{y_l}{P(H_l)} = \frac{1}{f^*_X(x)} \qquad (9)$$
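Eqs. (8) and (9) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function names and sample values are hypothetical, and $y_l$ is read here as the MLP output for class $l$ evaluated at each sample $x_k$.

```python
def metaplasticity_weight(y_l, prior_l):
    """Eq. (9): y_l / P(H_l), used in place of 1 / f*_X(x)."""
    return y_l / prior_l

def weighted_error_estimate(errors, outputs, prior_l):
    """Eq. (8): average over the M_l samples of class l of
    E(x_k) * y_l / P(H_l)."""
    m_l = len(errors)
    total = sum(e * metaplasticity_weight(y, prior_l)
                for e, y in zip(errors, outputs))
    return total / m_l

# Hypothetical values for three training samples of one class.
errors = [0.20, 0.05, 0.40]   # E(x_k) for each sample
outputs = [0.90, 0.95, 0.30]  # assumed MLP output y_l at each x_k
prior_l = 0.5                 # P(H_l)

estimate = weighted_error_estimate(errors, outputs, prior_l)
print(estimate)
```

The weight needs only the network output and the class prior, so no separate density estimate of the inputs is required once the outputs approximate the a posteriori probabilities.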
Eq. (7) takes advantage of the inherent a posteriori probability estimation that the MLP outputs provide for each input class, so it is used to quantify a pattern's frequency. Note that if this is not the case, as happens in the first steps of the BPA training algorithm, the training may not converge. In these first steps, the outputs of the MLP do not yet provide any valid estimation of the a posteriori probabilities, but rather random values corresponding to the initial guess of the MLP weights, $W$. It is then better, in these first steps of training, either to apply ordinary BPA training or to use another valid weighting function until BPA starts to minimize
 