Digital Signal Processing Reference
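Each $B_{kl}$, $k, l = 1, \ldots, N$, is a $(p \times p)$ square submatrix, whose $mn$-element is
given by

\[
[B_{kl}(\overline{\mathbf{W}})]_{mn} = \left. \left\{ w_{km} \frac{\partial F_k(\mathbf{W})}{\partial w_{ln}} + F_k(\mathbf{W})\, \delta(k - l, m - n) - \frac{\partial}{\partial w_{ln}} \int_{V_k(\mathbf{W})} x_m f(\mathbf{x})\, d\mathbf{x} \right\} \right|_{\mathbf{W} = \overline{\mathbf{W}}}, \tag{11.29}
\]

where

\[
F_k(\mathbf{W}) = \int_{V_k(\mathbf{W})} f(\mathbf{x})\, d\mathbf{x} \tag{11.30}
\]

and $\delta(k - l, m - n)$ is the 2D Kronecker delta function, i.e.,

\[
\delta(k - l, m - n) =
\begin{cases}
1 & \text{if } k = l \text{ and } m = n \\
0 & \text{otherwise.}
\end{cases} \tag{11.31}
\]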
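The quantity $F_k(\mathbf{W})$ in Equation 11.30 can be illustrated numerically. The following is a minimal sketch, not the authors' procedure. It assumes that $V_k(\mathbf{W})$ is the nearest-neighbor (Voronoi) cell of the weight vector $w_k$, which this passage does not state, and that $f(\mathbf{x})$ is uniform on the unit square. The function names, the two weight vectors, and the sample count are all hypothetical choices made for the illustration.

```python
import random


def nearest_index(x, weights):
    """Return the index k of the weight vector closest to x (squared Euclidean distance)."""
    def sq_dist(k):
        return sum((xi - wi) ** 2 for xi, wi in zip(x, weights[k]))
    return min(range(len(weights)), key=sq_dist)


def estimate_cell_masses(weights, sampler, n_samples=20000, seed=0):
    """Monte Carlo estimate of F_k(W): the fraction of samples drawn from f(x)
    that fall in the cell V_k(W), assumed here to be the Voronoi cell of w_k."""
    rng = random.Random(seed)
    counts = [0] * len(weights)
    for _ in range(n_samples):
        counts[nearest_index(sampler(rng), weights)] += 1
    return [c / n_samples for c in counts]


def kron_delta_2d(k, l, m, n):
    """2D Kronecker delta of Equation 11.31: 1 if k == l and m == n, else 0."""
    return 1 if (k == l and m == n) else 0


def uniform_unit_square(rng):
    """Hypothetical density f(x): uniform on [0, 1) x [0, 1)."""
    return (rng.random(), rng.random())


# Two illustrative weight vectors (N = 2, p = 2); the cell boundary is x_1 = 0.5.
weights = [(0.25, 0.5), (0.75, 0.5)]
masses = estimate_cell_masses(weights, uniform_unit_square)
print(masses)  # each estimate should be close to 0.5
```

Under these assumptions the estimates sum to one, because the cells partition the input space, and by Equation 11.31 the term $F_k(\mathbf{W})\,\delta(k - l, m - n)$ contributes only to the main diagonal of the diagonal blocks $B_{kk}$.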
The coefficient matrix B does not have any physical interpretation for
stochastic processes that are described by multivariate Fokker-Planck
differential equations, as is the training procedure of the self-organizing NN
under study. However, it is well known that for stochastic processes that are
described by single-variable Fokker-Planck differential equations (also known
as diffusion processes), the matrix B degenerates to a vector called the drift
vector [22]. Let us assume that the (real) matrix B is symmetric, so that it is
diagonalizable. It is worth noting that there is no guarantee that B is
symmetric in the general case of a stochastic process described by a
multivariate Fokker-Planck differential equation [21]. Under the symmetry
assumption, it can be shown that the necessary and sufficient condition for
convergence in the mean is that the symmetric matrix B is positive-definite. In
other words, the eigenvalues of matrix B, denoted by $\lambda_i$,
$i = 1, 2, \ldots, Np$, should be positive [24]. The careful reader would add
the additional condition $\lim_{t \to \infty} \int_0^t \alpha(\zeta)\, d\zeta = \infty$.
The necessity for a symmetric
matrix B can be alleviated if the convergence analysis is described in terms of
the trace of matrix $Y_d(t)$, as has been shown in Reference 25 and is explained
subsequently. Clearly, the convergence has the form of a negative exponential
for a constant adaptation step, and it is hyperbolic when $\alpha(t) = 1/t$. We
elaborate the case of a constant adaptation step. In this case, $Y(t)$ is a
square matrix of negative exponentials. Accordingly, each weight converges
toward the stationary solution (Equation 11.24) as a weighted sum of negative
exponentials of the form $\exp(-\lambda_i \alpha t)$. The time constant needed
for each term to reach $1/e$ of its initial value is given by

\[
\tau_i = \frac{1}{\alpha \lambda_i}. \tag{11.32}
\]
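As a worked illustration of Equation 11.32, the sketch below checks the positive-definiteness condition and computes the per-mode time constants for a constant adaptation step. It is only an example under stated assumptions: the symmetric 2 x 2 matrix with entries 2, 1, 1, 2 and the step $\alpha = 0.01$ are invented values, not taken from the text, and the helper names are hypothetical.

```python
import math


def symmetric_2x2_eigenvalues(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], smallest first."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return (mean - radius, mean + radius)


def time_constants(eigenvalues, alpha):
    """Per-mode time constants tau_i = 1 / (alpha * lambda_i) (Equation 11.32).

    Raises ValueError if the step is not positive or if any eigenvalue is
    non-positive, i.e., if the symmetric matrix is not positive-definite."""
    if alpha <= 0:
        raise ValueError("the adaptation step alpha must be positive")
    if any(lam <= 0 for lam in eigenvalues):
        raise ValueError("matrix is not positive-definite")
    return [1.0 / (alpha * lam) for lam in eigenvalues]


# Hypothetical symmetric matrix B = [[2, 1], [1, 2]] and constant step alpha.
eigs = symmetric_2x2_eigenvalues(2.0, 1.0, 2.0)
alpha = 0.01
taus = time_constants(eigs, alpha)

for lam, tau in zip(eigs, taus):
    # After t = tau_i the term exp(-lambda_i * alpha * t) has decayed to 1/e.
    decay = math.exp(-lam * alpha * tau)
    print(f"lambda = {lam:.3f}, tau = {tau:.3f}, decay at tau = {decay:.4f}")
```

In this example the smallest eigenvalue gives the largest time constant, so the slowest term in the weighted sum is the one associated with the smallest $\lambda_i$.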
However, the total time constant $\tau_a$, defined as the time needed so that any
expected synaptic weight decays to $1/e$ of its initial value, cannot be expressed