taking into account that the sigmoid transfer function h(t) of Eq. (4) produces values between 0 and 1.

To start the optimization for the weights W, initial values are chosen randomly from a uniform distribution between -5 and 5. These limits are arbitrarily chosen, but are considered to be sufficiently wide to lead to a wide range in the predicted outputs Y(k). For each weight selection (e.g., 5000 combinations in this Chapter) the corresponding prediction error E is calculated from Eq. (5),
\[
E = \sum_{k=1}^{NP} \left( Y(k) - T(k) \right)^{2} \tag{5}
\]

keeping the combination of weights W_0 which corresponds to the minimum among the calculated errors E.

The absolute values of the NP errors (Y(k) - T(k)), corresponding to the weights W_0, are ranked from smallest to largest. The set corresponding to the 80% of the data with the largest errors is used for the network training, with the remaining 20% available for validation.

With W_0 representing an "anchor" point with corresponding error E_0 over the training set, new values for the weights W are randomly chosen within a neighborhood of W_0, and the corresponding errors E are calculated with Eq. (5). Whenever a set W is found for which E < E_0, this set becomes the new W_0 and the search process is repeated around this new anchor. The maximum number of repetitions is limited to a prescribed number (500 in the work presented here).

The number of samples of weights W around a given anchor W_0 is limited, e.g., to 1000. If this limit is reached and all errors satisfy E > E_0, then convergence has been achieved and the minimum solution is estimated to be in correspondence with the anchor W_0.

If either convergence has been achieved, or the number of repetitions has exceeded the stipulated maximum, then the NP individual errors are calculated and ranked, re-initiating the search for the new 80% of the data corresponding to the greatest errors in absolute value. This process is repeated NREP times (100 in the work presented here), looking for the possibility of local minima. The final solution corresponds to the W_0 with the minimum overall error E_0.
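To make the procedure concrete, the following is a minimal Python sketch of the two-stage random search just described. The predictor net(W, X), the neighborhood radius, and the array shapes are illustrative assumptions not specified in the text; only the sampling limits (-5 to 5), the 5000 initial draws, the 1000 samples per anchor, and the 500 anchor moves come from the text.

```python
# Sketch only: `net`, `radius`, and data shapes are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def error(W, X, T, net):
    """Eq. (5): sum of squared differences between predictions Y(k) and targets T(k)."""
    Y = net(W, X)
    return np.sum((Y - T) ** 2)

def random_search(X, T, net, n_weights,
                  n_init=5000,           # initial draws from U(-5, 5)
                  n_local=1000,          # samples per anchor
                  max_anchor_moves=500,  # prescribed repetition limit
                  radius=0.5):           # neighborhood size (assumed)
    # Stage 1: global random sampling; keep the best weight vector W0.
    W0, E0 = None, np.inf
    for _ in range(n_init):
        W = rng.uniform(-5.0, 5.0, size=n_weights)
        E = error(W, X, T, net)
        if E < E0:
            W0, E0 = W, E
    # Stage 2: sample around the anchor W0 and re-anchor whenever E < E0.
    moves = 0
    while moves < max_anchor_moves:
        for _ in range(n_local):
            W = W0 + rng.uniform(-radius, radius, size=n_weights)
            E = error(W, X, T, net)
            if E < E0:
                W0, E0 = W, E   # better point found: it becomes the new anchor
                moves += 1
                break
        else:
            break               # n_local samples without improvement: converged
    return W0, E0
```

The outer loop, repeated NREP = 100 times in the text, would re-rank the NP individual errors after each convergence and re-select the 80% of the data with the largest absolute errors before searching again.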
3.3 Subsequent Optimization Using Gradients
The search-based algorithm described above can be complemented with a gradient-based optimization starting from the final weights W_0. The following alternatives may be pursued:
A. Search for weights W trying to zero out the error E.
If the error E is approximated as a linear function around the anchor W_0,
\[
E = E_0 + G_0^{\mathrm{T}} \left( W - W_0 \right) \tag{6}
\]
in which G_0 is the gradient vector at W_0,
\[
G_{0i} = \left. \frac{\partial E}{\partial W_i} \right|_{W_0} \tag{7}
\]
the objective is to find W to make E = 0. Thus, the vector W - W_0 is taken along the gradient direction (but in the negative direction):
\[
\left( W - W_0 \right) = -\lambda\, G_0 \tag{8}
\]
The step magnitude λ is then calculated from
\[
E = E_0 + G_0^{\mathrm{T}} \left( -\lambda\, G_0 \right) = 0
\quad\Rightarrow\quad
\lambda = \frac{E_0}{G_0^{\mathrm{T}} G_0} \tag{9}
\]
giving a new W as

\[
W = W_0 - \lambda\, G_0
\]
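A minimal sketch of one such gradient step (Eqs. 6-9) follows, reusing the error() function from the previous sketch. The forward-difference approximation of the gradient is an assumption, since the text does not specify how G_0 is obtained.

```python
# Sketch only: finite-difference gradient is an assumed choice.
import numpy as np

def gradient_step(W0, X, T, net, eps=1e-6):
    E0 = error(W0, X, T, net)   # error at the anchor, Eq. (5)
    # Eq. (7): G0_i = dE/dW_i at W0, approximated by forward differences.
    G0 = np.empty_like(W0)
    for i in range(W0.size):
        Wp = W0.copy()
        Wp[i] += eps
        G0[i] = (error(Wp, X, T, net) - E0) / eps
    lam = E0 / (G0 @ G0)        # Eq. (9): step that zeros the linearized error
    return W0 - lam * G0        # Eq. (8): move against the gradient
```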