Digital Signal Processing Reference
In-Depth Information
given in [38, 39, 41] can be considerably simplified by using Wirtinger derivatives. As
we demonstrate in the next example, split activation functions do not use the available information efficiently when learning nonlinear mappings, and hence are less
desirable for use as activation functions.
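The distinction between the two activation types can be made concrete in code. The following minimal NumPy sketch (the function names are ours, not from the text) contrasts a split tanh, which squashes the real and imaginary parts independently, with the fully complex tanh, which evaluates tanh as an analytic function of the complex argument:

```python
import numpy as np

def split_tanh(z):
    # Split activation: tanh applied independently to the real
    # and imaginary parts of the complex input.
    return np.tanh(z.real) + 1j * np.tanh(z.imag)

def fully_complex_tanh(z):
    # Fully complex activation: tanh evaluated directly on the
    # complex argument (analytic except at isolated singularities).
    return np.tanh(z)

# The two coincide on the real axis but differ for genuinely
# complex inputs, where the fully complex version couples the
# real and imaginary parts.
z = 0.3 + 0.4j
print(split_tanh(z), fully_complex_tanh(z))
```

The split version discards the coupling between real and imaginary parts, which is one way to see why it uses the information in a complex signal less efficiently.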
EXAMPLE 1.7
In Figure 1.18, we show the convergence characteristics of two MLP filters, one
using split tanh and a second one that uses the complex tanh as the activation
function. The input is generated using the same model as in Example 1.5 with
ρ = 1, and the nonlinear output of the system is given as d(n) + 0.2 d^2(n),
where d(n) = w_opt^T x(n) with the coefficients w_opt selected as in Example 1.5. The
size of the input layer is chosen as 5 to match the memory of the finite impulse
response component of the system, and the filter has a single output. The stepsize
is chosen as 0.01 for both the split and the fully complex MLP filters and the
convergence behavior is shown for two different filter sizes, one with a filter
using 15 hidden nodes and a second one with 40 hidden nodes. As observed in
the figures, the MLP filter using a fully complex activation function produces
lower squared error values; however, the performance advantage of the fully
complex filter decreases as the number of hidden nodes increases, as observed in
Figure 1.18b.
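The data-generation step of this experiment can be sketched as follows. The FIR coefficients w_opt and the exact input model come from Example 1.5, which is not reproduced here, so the values below are placeholder assumptions; only the memory of 5 taps and the memoryless nonlinearity d(n) + 0.2 d^2(n) are taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 5-tap complex FIR weights; the chapter takes w_opt
# from Example 1.5, which is not reproduced here.
w_opt = np.array([1.0, 0.5, -0.3, 0.2, -0.1]) + \
        1j * np.array([0.2, -0.4, 0.1, 0.3, -0.2])

N = 1000
# Noncircular Gaussian input: with rho = 1 the real and imaginary
# parts are fully correlated (a stand-in for the Example 1.5 model).
u = rng.standard_normal(N)
x = (u + 1j * u) / np.sqrt(2)

# Linear FIR part d(n) = w_opt^T x(n) over a 5-sample memory,
# followed by the memoryless nonlinearity d(n) + 0.2 d^2(n).
d = np.convolve(x, w_opt, mode="full")[:N]
target = d + 0.2 * d**2
```

A 5-input MLP trained on (x, target) pairs then faces exactly the identification task of the example: the input layer width matches the 5-tap FIR memory, and the hidden layer must capture the quadratic distortion.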
The example demonstrates that the fully complex MLP filter is able to use information more efficiently; however, the performance of the filter that uses a split-type
activation function can be improved by increasing the filter complexity. Note that
the universal approximation property of MLP filters can be demonstrated for both filter
types, and the approximation result guarantees that the MLP structure can come
arbitrarily close to any desired continuous mapping.
Figure 1.18 Performance of two (a) 5-15-1 and (b) 5-40-1 MLP filters for a nonlinear
system identification problem using split and fully-complex tanh activation functions.