first approximation of the neuron behavior [i.e., until the minimum is reached for the first time, as will be clear later (first approximation assumption)]. Recall that the first approximation is exact if the weights are constrained to a subset of the weight vector space, such as a hypersphere or a hyperplane. Consider the
MCA EXIN linear neuron

$$ y(t) = w^T(t)\, x(t) \qquad (2.47) $$

where $w(t), x(t) \in \mathbb{R}^n$ are, respectively, the weight and the input vector. The averaged cost function is [eq. (2.32)]

$$ E = r(w, R) = \frac{w^T R w}{w^T w} \qquad (2.48) $$
where $R = E[x x^T]$ is the autocorrelation matrix of the input vector $x$.
Assume that $R$ is well behaved (i.e., full rank with distinct eigenvalues). Being an autocorrelation matrix, it is symmetric and positive definite, with orthogonal eigenvectors $z_n, z_{n-1}, \ldots, z_1$ and corresponding eigenvalues $\lambda_n < \lambda_{n-1} < \cdots < \lambda_1$.
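As a quick numerical sanity check (a sketch, not from the text; the matrix and sample sizes here are arbitrary), the Rayleigh quotient (2.48) of any such well-behaved $R$ always lies between its smallest and largest eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
R = A @ A.T + 0.1 * np.eye(5)      # symmetric, positive definite "autocorrelation"

lams = np.linalg.eigvalsh(R)       # ascending: lambda_n, ..., lambda_1
for _ in range(100):
    w = rng.standard_normal(5)
    r = (w @ R @ w) / (w @ w)      # Rayleigh quotient r(w, R), eq. (2.48)
    assert lams[0] - 1e-12 <= r <= lams[-1] + 1e-12
```

This is why $E$ is bounded from below: its infimum over nonzero $w$ is the smallest eigenvalue $\lambda_n$.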
The cost function $E$ is bounded from below (see Section 2.1). As seen in eq. (2.33), the gradient of the averaged cost function, up to a constant factor, is given by

$$ \nabla E = \frac{1}{w^T(t)\, w(t)} \left[ R\, w(t) - r(w(t), R)\, w(t) \right] \qquad (2.49) $$

which is used for the MCA EXIN learning law.
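To make (2.48)–(2.49) concrete, here is a minimal NumPy sketch (function names are hypothetical). Note that the exact gradient of the Rayleigh quotient is twice expression (2.49), consistent with the constant factor dropped in the text; the finite-difference check below accounts for that factor:

```python
import numpy as np

def rayleigh(w, R):
    """r(w, R) = (w^T R w) / (w^T w), eq. (2.48)."""
    return (w @ R @ w) / (w @ w)

def grad_E(w, R):
    """Eq. (2.49): (1/(w^T w)) [R w - r(w, R) w]."""
    return (R @ w - rayleigh(w, R) * w) / (w @ w)

# Central-difference check: the exact gradient equals 2 * grad_E.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
R = A @ A.T + np.eye(4)            # well-behaved autocorrelation matrix
w = rng.standard_normal(4)

eps = 1e-6
fd = np.array([(rayleigh(w + eps * e, R) - rayleigh(w - eps * e, R)) / (2 * eps)
               for e in np.eye(4)])
assert np.allclose(2 * grad_E(w, R), fd, atol=1e-5)
```

Since the constant factor only rescales the gradient flow, it does not change the critical points discussed next.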
2.5.1 Critical Directions
The following reasoning provides an alternative demonstration of Proposition 43 and introduces the notation. As a consequence of the assumptions above, the weight vector space has a basis of orthogonal eigenvectors. Thus,

$$ w(t) = \sum_{i=1}^{n} \omega_i(t)\, z_i \qquad (2.50) $$
From eqs. (2.49) and (2.50), it follows that the coordinates of the gradient along
the principal components are
$$ (\nabla E)^T z_i = \left[ \lambda_i - \frac{\sum_{j=1}^{n} \lambda_j \omega_j^2(t)}{\sum_{j=1}^{n} \omega_j^2(t)} \right] \frac{\omega_i(t)}{w^T(t)\, w(t)} \qquad (2.51) $$
Then the critical points of the cost landscape $E$ are given by

$$ \omega_i(t) = 0 \quad \text{or} \quad \lambda_i - \frac{\sum_{j=1}^{n} \lambda_j \omega_j^2(t)}{\sum_{j=1}^{n} \omega_j^2(t)} = 0 \qquad (2.52) $$