Digital Signal Processing Reference
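Each $\mathbf{B}_{kl}$, $k, l = 1, \ldots, N$, is a $(p \times p)$ square submatrix, whose $mn$-element is given by
\[
[\mathbf{B}_{kl}(\mathbf{W})]_{mn} = \left[ w_{km} \frac{\partial F_k(\mathbf{W})}{\partial w_{ln}} + F_k(\mathbf{W})\, \delta(k-l, m-n) - \frac{\partial}{\partial w_{ln}} \int_{V_k(\mathbf{W})} x_m f(\mathbf{x})\, d\mathbf{x} \right]_{\mathbf{W} = \overline{\mathbf{W}}}, \tag{11.29}
\]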
where
\[
F_k(\mathbf{W}) = \int_{V_k(\mathbf{W})} f(\mathbf{x})\, d\mathbf{x} \tag{11.30}
\]
and $\delta(k-l, m-n)$ is the 2D Kronecker delta function, i.e.,
\[
\delta(k-l, m-n) = \begin{cases} 1 & k = l \text{ and } m = n \\ 0 & \text{otherwise.} \end{cases} \tag{11.31}
\]
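The quantity $F_k(\mathbf{W})$ in Equation 11.30 is the probability mass of $f(\mathbf{x})$ that falls inside the region $V_k(\mathbf{W})$, so it can be approximated by counting samples. The following is only a minimal sketch under stated assumptions: it takes $V_k(\mathbf{W})$ to be the nearest-neighbor (Voronoi) cell of the weight vector $\mathbf{w}_k$, it uses a standard Gaussian as a stand-in for $f(\mathbf{x})$, and the function name `estimate_cell_probabilities` is hypothetical.

```python
import numpy as np

def estimate_cell_probabilities(W, samples):
    """Monte Carlo estimate of F_k(W) for every k.

    ASSUMPTION: V_k(W) is the nearest-neighbor (Voronoi) cell of row k of W.
    W has shape (N, p); samples has shape (M, p) and is drawn from f(x).
    """
    W = np.asarray(W, dtype=float)
    X = np.asarray(samples, dtype=float)
    # Squared Euclidean distance from every sample to every weight vector.
    d2 = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    winners = d2.argmin(axis=1)  # index of the closest weight vector
    counts = np.bincount(winners, minlength=W.shape[0])
    return counts / X.shape[0]   # fraction of samples in each cell

# Illustrative data only: a standard 2D Gaussian stands in for f(x).
rng = np.random.default_rng(0)
samples = rng.normal(size=(10000, 2))
W = np.array([[-1.0, 0.0], [1.0, 0.0]])  # hypothetical weight vectors
F = estimate_cell_probabilities(W, samples)
```

Because the cells partition the input space, the estimates sum to one; for the symmetric pair of weight vectors above, each cell should hold roughly half of the samples.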
The coefficient matrix $\mathbf{B}$ does not have any physical interpretation for stochastic processes that are described by multivariate Fokker-Planck differential equations, as is the training procedure of the self-organizing NN under study. However, it is well known that for stochastic processes that are described by single-variable Fokker-Planck differential equations (also known as diffusion processes), the matrix $\mathbf{B}$ degenerates to a vector called the drift vector [22].
Let us assume that the (real) matrix $\mathbf{B}$ is symmetric, so that it is diagonalizable. It is worth noting that there is no guarantee that $\mathbf{B}$ is symmetric in the general case of a stochastic process described by a multivariate Fokker-Planck differential equation [21].
Under this condition, it can be shown that the necessary and sufficient condition for convergence in the mean is that the symmetric matrix $\mathbf{B}$ is positive-definite. In other words, the eigenvalues of matrix $\mathbf{B}$, denoted by $\lambda_i$, $i = 1, 2, \ldots, Np$, should be positive [24].
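The positivity condition on the eigenvalues can be checked numerically. The sketch below assumes a small real symmetric matrix with made-up entries (it is not derived from Equation 11.29), and the helper name `converges_in_mean` is hypothetical. It relies on `numpy.linalg.eigvalsh`, which returns the eigenvalues of a symmetric matrix in ascending order.

```python
import numpy as np

def converges_in_mean(B, tol=1e-12):
    """Check the eigenvalue condition for a real symmetric matrix B.

    Returns (ok, eigenvalues), where ok is True when every eigenvalue
    exceeds tol, i.e., B is numerically positive-definite.
    """
    B = np.asarray(B, dtype=float)
    if not np.allclose(B, B.T):
        raise ValueError("this test assumes a symmetric matrix B")
    eigvals = np.linalg.eigvalsh(B)  # real eigenvalues, ascending order
    return bool(eigvals[0] > tol), eigvals

# Hypothetical 2x2 example, for illustration only.
B_example = np.array([[2.0, 0.5],
                      [0.5, 1.0]])
ok, lam = converges_in_mean(B_example)
```

Testing only the smallest eigenvalue is sufficient here because the eigenvalues are returned sorted; a matrix such as `[[1, 2], [2, 1]]`, which has a negative eigenvalue, fails the check.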
The careful reader would add the additional condition $\lim_{t \to \infty} \int_0^t \alpha(\zeta)\, d\zeta = \infty$. The necessity for a symmetric matrix $\mathbf{B}$ can be alleviated if the convergence analysis is described in terms of the trace of matrix $\mathbf{Y}(t)$, as has been shown in Reference 25, and is explained subsequently. Clearly, the convergence is of the form of a negative exponential for a constant adaptation step and it is hyperbolic when $\alpha(t) = 1/t$. We elaborate the case of a constant adaptation step. In this case,
$\mathbf{Y}(t)$ is a square matrix of negative exponentials. Accordingly, each weight converges toward the stationary solution (Equation 11.24) as a weighted sum of negative exponentials of the form $\exp(-\lambda_i \alpha t)$. The time constant needed for each term to reach $1/e$ of its initial value is given by
\[
\tau_i = \frac{1}{\alpha \lambda_i}. \tag{11.32}
\]
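As a quick numerical illustration of Equation 11.32, the sketch below computes the per-mode time constants for a constant adaptation step and verifies that each term $\exp(-\lambda_i \alpha t)$ has dropped to $1/e$ at $t = \tau_i$. The eigenvalues and the step size are made-up values, and the function name `time_constants` is hypothetical.

```python
import numpy as np

def time_constants(eigenvalues, alpha):
    """Per-mode time constants tau_i = 1 / (alpha * lambda_i), Eq. 11.32.

    Requires a positive constant step alpha and positive eigenvalues.
    """
    lam = np.asarray(eigenvalues, dtype=float)
    if alpha <= 0 or np.any(lam <= 0):
        raise ValueError("alpha and all eigenvalues must be positive")
    return 1.0 / (alpha * lam)

# Illustrative values only.
lam = np.array([0.5, 2.0])   # hypothetical eigenvalues of B
alpha = 0.1                  # hypothetical constant adaptation step
tau = time_constants(lam, alpha)

# Each exponential term evaluated at its own time constant equals exp(-1).
decay_at_tau = np.exp(-lam * alpha * tau)
```

The smallest eigenvalue yields the largest time constant, so that mode is the slowest term in the weighted sum.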
However, the total time constant $\tau_a$, defined as the time needed so that any expected synaptic weight decays to $1/e$ of its initial value, cannot be expressed