chain rule. From Figure 7.8, if we mentally build a signal-flow graph from each weight of interest up to the output, we can obtain
$$
\begin{aligned}
\frac{\partial J_{BP}(n)}{\partial w_{ij}} &= \frac{\partial J_{BP}(n)}{\partial e(n)} \cdot \frac{\partial e(n)}{\partial y(n)} \cdot \frac{\partial y(n)}{\partial u_o(n)} \cdot \frac{\partial u_o(n)}{\partial y_i(n)} \cdot \frac{\partial y_i(n)}{\partial u_i(n)} \cdot \frac{\partial u_i(n)}{\partial w_{ij}} \\
&= 2 \cdot e(n) \cdot (-1) \cdot (1) \cdot w_i \cdot \varphi'[u_i(n)] \cdot x_j(n) \\
&= -2\, e(n)\, w_i\, \varphi'[u_i(n)]\, x_j(n)
\end{aligned}
\qquad (7.33)
$$
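To make the chain-rule factorization in (7.33) concrete, the short sketch below checks the analytic derivative against a finite-difference estimate for a single hidden-layer weight. It is only an illustration of the formula, not the book's Algorithm 7.1; the tanh activation, the network dimensions, and all variable names (`phi_prime`, `W`, `w_o`, and so on) are assumptions made for this example.

```python
import numpy as np

# Assumed activation and its derivative; the text only writes phi, so tanh is a choice.
phi = np.tanh
phi_prime = lambda u: 1.0 - np.tanh(u) ** 2

rng = np.random.default_rng(0)
K, N_neuron = 3, 4                      # assumed input and hidden-layer sizes
W = rng.normal(size=(N_neuron, K))      # hidden-layer weights, row i is w_i
w_o = rng.normal(size=N_neuron + 1)     # output weights [w_0, w_1, ..., w_{N_neuron}]
x = rng.normal(size=K)                  # input vector x(n)
d = 0.7                                 # desired response d(n)

def cost(W_h):
    """Instantaneous cost J_BP(n) = e^2(n) for hidden-layer weights W_h."""
    u = W_h @ x                         # u_i(n)
    y_int = np.concatenate(([1.0], phi(u)))
    y = w_o @ y_int                     # Equation (7.34)
    return (d - y) ** 2

# Analytic gradient of Equation (7.33) for the single weight w_ij
i, j = 1, 2                             # zero-based neuron and input indices
u = W @ x
e = d - w_o @ np.concatenate(([1.0], phi(u)))
grad_analytic = -2.0 * e * w_o[i + 1] * phi_prime(u[i]) * x[j]

# Finite-difference check of the same derivative
eps = 1e-6
W_pert = W.copy()
W_pert[i, j] += eps
grad_numeric = (cost(W_pert) - cost(W)) / eps

print(grad_analytic, grad_numeric)      # the two estimates should agree closely
```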
Notice that we may express the output in a compact form as
$$
y(n) = \mathbf{w}_o^T\, \mathbf{y}_{int}(n)
\qquad (7.34)
$$

where $\mathbf{y}_{int}(n) = [\,1,\; y_1(n),\; \ldots,\; y_{N_{neuron}}(n)\,]^T$ and $\mathbf{w}_o^T = [\,w_0,\; w_1,\; w_2,\; \ldots,\; w_{N_{neuron}}\,]$. Thus, the gradients of $J_{BP}(n)$ related to the weights in the output layer $\mathbf{w}_o$ and to the bias $w_0$ are given by
$$
\frac{\partial J_{BP}(n)}{\partial \mathbf{w}_o} = -2\, e(n)\, \mathbf{y}_{int}(n)
\qquad (7.35)
$$

$$
\frac{\partial J_{BP}(n)}{\partial w_0} = -2\, e(n)
\qquad (7.36)
$$
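As a small numerical illustration of (7.35) and (7.36), the snippet below forms both output-layer gradients from an error sample and the hidden-layer outputs; the concrete values and variable names are invented for the example, not taken from the book.

```python
import numpy as np

# Illustrative values (not from the book): an error sample e(n) and the
# hidden-layer outputs y_1(n), ..., y_{N_neuron}(n) at one time instant.
e_n = 0.25
y_hidden = np.array([0.3, -0.1, 0.8])

# y_int(n) = [1, y_1(n), ..., y_{N_neuron}(n)]^T, with the leading 1 for the bias
y_int = np.concatenate(([1.0], y_hidden))

grad_w_o = -2.0 * e_n * y_int   # Equation (7.35)
grad_w_0 = -2.0 * e_n           # Equation (7.36), i.e. the bias entry grad_w_o[0]
```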
Now, if we define
$$
\mathbf{x}(n) = [\,x_1(n)\;\; x_2(n)\;\; \cdots\;\; x_K(n)\,]^T
\qquad (7.37)
$$
we can express the gradient of $J_{BP}(n)$ with respect to the $i$th set of weights present in the hidden layer by

$$
\frac{\partial J_{BP}(n)}{\partial \mathbf{w}_i} = -2\, e(n)\, w_i\, \varphi'[u_i(n)]\, \mathbf{x}(n), \qquad i = 1, \ldots, N_{neuron}
\qquad (7.38)
$$

being $\mathbf{w}_i = [\,w_{i1},\; w_{i2},\; \ldots,\; w_{iK}\,]^T$.
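Similarly, (7.38) can be evaluated for one hidden neuron as sketched below. Note that the scalar $w_i$ on the right-hand side is the output-layer weight attached to neuron $i$ (the $\partial u_o(n)/\partial y_i(n)$ factor of (7.33)), so it is named `w_i_out` in the code to avoid confusion with the hidden-layer vector $\mathbf{w}_i$; the tanh activation and the numeric values are assumptions of this example.

```python
import numpy as np

# Hidden-layer gradient of Equation (7.38) for one neuron i.
# phi is assumed to be tanh, so phi'(u) = 1 - tanh(u)^2.
phi_prime = lambda u: 1.0 - np.tanh(u) ** 2

e_n = 0.25                          # e(n)
w_i_out = -0.4                      # output-layer weight w_i multiplying y_i(n)
u_i = 0.9                           # pre-activation u_i(n) of neuron i
x_n = np.array([1.2, -0.7, 0.05])   # input vector x(n) from Equation (7.37)

# Gradient with respect to w_i = [w_i1, ..., w_iK]^T: one entry per input x_j(n)
grad_w_i = -2.0 * e_n * w_i_out * phi_prime(u_i) * x_n
```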
Having found the gradient vector with respect to the weights of the hidden layer and of the output layer, we are in a position to update all weights of our network in the spirit of the steepest-descent method. However, we must not forget that we purposely calculated the derivatives with respect to an instantaneous squared error, which means that the resulting BPA will be conceptually similar to the LMS procedure. This approach constitutes the online BPA, which is particularly suitable for real-time applications. It is expressed by Algorithm 7.1.
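Since Algorithm 7.1 is not reproduced in this excerpt, the following sketch only suggests how the per-sample (online) update could look when the gradients (7.35), (7.36), and (7.38) are plugged into a steepest-descent step; the step size `mu`, the tanh activation, and the toy desired response are assumptions of the example, not the book's specification.

```python
import numpy as np

# Minimal online BP sketch: one steepest-descent step per incoming sample,
# using the instantaneous squared error, in the spirit of the LMS procedure.
phi = np.tanh
phi_prime = lambda u: 1.0 - np.tanh(u) ** 2

rng = np.random.default_rng(1)
K, N_neuron, mu = 2, 5, 0.05                 # assumed sizes and step size
W = 0.1 * rng.normal(size=(N_neuron, K))     # hidden-layer weights, row i is w_i
w_o = 0.1 * rng.normal(size=N_neuron + 1)    # output weights [w_0, w_1, ..., w_N]

for n in range(2000):
    x = rng.uniform(-1.0, 1.0, size=K)       # x(n)
    d = np.sin(np.pi * x[0]) * x[1]          # assumed desired response d(n)

    u = W @ x                                # u_i(n)
    y_int = np.concatenate(([1.0], phi(u)))  # y_int(n)
    y = w_o @ y_int                          # Equation (7.34)
    e = d - y                                # e(n)

    grad_w_o = -2.0 * e * y_int                               # (7.35)-(7.36)
    grad_W = -2.0 * e * np.outer(w_o[1:] * phi_prime(u), x)   # (7.38), one row per neuron
    w_o -= mu * grad_w_o                     # steepest-descent updates
    W -= mu * grad_W
```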