It should be noted that, on equating the denominators of both expressions to zero, we observe that the function becomes unbounded at points of the type (0, (2n + 1)π) for any natural number n. Figure 3.2a is a plot of the real part and Fig. 3.2b is a plot of the imaginary part of the Haykin activation function. Both figures are characterized by prominent peaks (singular points). Invoking Liouville's theorem, a function that is analytic and bounded over the entire complex plane must be constant; the Haykin activation is therefore necessarily unbounded, yet it can still qualify as an activation function provided its singular points are avoided. To avoid the singular points, the inputs to the neuron should be scaled to a region that is devoid of them [21]. In the update rule for the weights with the Haykin activation function, the derivative of Eq. 3.1 with respect to z needs to be computed, which is f_C(z)(1 − f_C(z)). The surface plots of the expressions for the real and imaginary parts of the derivative of Haykin's activation function with respect to the real part x and the imaginary part y¹ are displayed in Fig. 3.2c-f, respectively. All surfaces, as can be seen, are characterized by peaks. The backpropagation learning algorithm developed with the Haykin activation function therefore has singularities at countably many points, and the derivative of the Haykin activation is likewise singular at these points.
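As a quick numerical check of the behaviour described above, the short NumPy sketch below evaluates the complex sigmoid f_C(z) = 1/(1 + exp(−z)) and the derivative f_C(z)(1 − f_C(z)) at points approaching the singular point (0, π). The function names and the probe offsets are illustrative choices, not taken from the book.

```python
import numpy as np

def haykin_sigmoid(z):
    """Complex extension of the logistic sigmoid: f_C(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def haykin_sigmoid_deriv(z):
    """Derivative used in the weight-update rule: f_C(z) * (1 - f_C(z))."""
    f = haykin_sigmoid(z)
    return f * (1.0 - f)

# Approach the singular point z0 = j*pi, i.e. (0, (2n+1)*pi) with n = 0.
z0 = 1j * np.pi
for eps in (1e-1, 1e-2, 1e-3):
    z = z0 + eps  # step slightly off the pole along the real axis
    print(f"eps={eps:g}  |f_C(z)|={abs(haykin_sigmoid(z)):.3e}  "
          f"|f_C(z)(1-f_C(z))|={abs(haykin_sigmoid_deriv(z)):.3e}")
```

Both magnitudes grow without bound as the probe point approaches the pole, which is exactly the kind of behaviour that destabilizes the weight update.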
3.2.3.2 Problem with Haykin's Activation Function
When the domain of a conventional activation such as the sigmoid in Eq. 3.1, tan(z) or exp(z²) is extended from real to complex, it is seen that if z approaches any value in the set {0 ± j(2n + 1)π}, where n is an integer, then |f_C(z)| → ∞; thus f_C(z) is unbounded. It was also suggested in Leung and Haykin [21] that, to avoid the problem of singularities in the sigmoid function f_C(z), the input data should be scaled to some region in the complex plane.
The position of the singularities disturbs the training scheme: whenever some intermediate weights fall in the vicinity of the singular points, the whole training process downstream receives a jolt. This is revealed by the error plot of the function, which is characterized by peaks. The typical point scatter shown in Fig. 3.3a is a distribution of the hidden-layer weights while the training process is running. The figure shows four singular points of the Haykin activation that are completely engulfed by the cloud of points. As can be observed, the weights cluster around some of the singular points, which eventually results in the peak-type error-epoch characteristic. The typical error-function graph with the Haykin activation is shown in Fig. 3.3b. The training process produces many peaks as a result of the singular configurations encountered at the activation function's singular points. The complex backpropagation developed over this activation function fails to solve many problems.
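As a rough illustration of the input-scaling remedy suggested in [21], the sketch below shrinks a batch of complex inputs into a disc of radius π − margin around the origin, so that every point stays at least margin away from the nearest singularities at ±jπ. The function name scale_inputs, the disc-based criterion and the default margin are assumptions made for illustration, not the authors' prescription.

```python
import numpy as np

def scale_inputs(z, margin=0.5):
    """Shrink complex inputs into a disc of radius (pi - margin) about the origin,
    keeping them at least `margin` away from the nearest singular points of
    f_C(z) = 1/(1 + exp(-z)), which lie at z = +/- j(2n+1)*pi."""
    z = np.asarray(z, dtype=complex)
    radius = np.pi - margin
    max_mod = np.max(np.abs(z))
    if max_mod == 0.0:
        return z                          # nothing to scale
    factor = min(1.0, radius / max_mod)   # only shrink, never enlarge
    return z * factor

# Raw inputs whose imaginary parts wander close to the first pole at j*pi
raw = np.array([2.0 + 3.0j, -1.0 + 3.5j, 0.5 - 2.9j])
scaled = scale_inputs(raw)
print(np.round(np.abs(scaled), 3))  # every modulus is now below pi - margin
```

Note that scaling the inputs alone does not prevent intermediate weighted sums from drifting near a pole during training, which is precisely the failure mode illustrated in Fig. 3.3.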
In the case of complex-valued networks, T. Adali (2003) broadly categorized the fully complex-valued activation functions by their properties [11] into three types. It was also shown that universal approximation can be achieved with each of them. The first type of complex-valued functions concerns the functions without any singular points. These functions can be used as activation functions, and networks with this type of activation function are shown to be good approximators. Although some of
¹ Finding the expressions for the real and imaginary parts of the derivative of Haykin's activation function with respect to the real part x and the imaginary part y is left to interested readers.