Table 7.10. Perturbation by local mutation around local minima.

  No.  Update rule                       Range of the random value r
  1    {η, ε, α} ← {η, ε, α} + r         −0.1 ≤ r ≤ 0.1,   r ∈ ℝ
  2    n_k ← n_k + r                     −5 ≤ r ≤ 5,       r ∈ ℤ
  3    h ← h + r                         −1 ≤ r ≤ 1,       r ∈ ℤ
  4    w_ji(s+1) ← w_ji(s+1) + r         −1.0 ≤ r ≤ 1.0,   r ∈ ℝ
Table 7.11. Expanding the search space by global mutation.

  No.  Update rule                       Range of the random value r
  5    {η, ε, α} ← r                     0.0 ≤ r ≤ 1.0,    r ∈ ℝ
  6    n_k ← r                           2 ≤ r ≤ 20,       r ∈ ℕ
  7    h ← r                             1 ≤ r ≤ 3,        r ∈ ℕ
  8    w_ji(s+1) ← r                     −5.0 ≤ r ≤ 5.0,   r ∈ ℝ
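The two operators in Tables 7.10 and 7.11 can be sketched as follows. The chromosome layout (a dict with fields eta, eps, alpha, n_k, h, and weights) is an assumption made for illustration; the perturbation and replacement ranges come from the tables.

```python
import random

def local_mutation(chrom):
    """Perturb each gene by a small random amount around its current
    value (Table 7.10) -- a fine search near a local minimum."""
    chrom["eta"]   += random.uniform(-0.1, 0.1)
    chrom["eps"]   += random.uniform(-0.1, 0.1)
    chrom["alpha"] += random.uniform(-0.1, 0.1)
    chrom["n_k"]   += random.randint(-5, 5)   # integer-valued gene
    chrom["h"]     += random.randint(-1, 1)
    chrom["weights"] = [w + random.uniform(-1.0, 1.0) for w in chrom["weights"]]
    return chrom

def global_mutation(chrom):
    """Replace each gene with a fresh random value over its full range
    (Table 7.11) -- expands the search space to escape local minima."""
    chrom["eta"]   = random.uniform(0.0, 1.0)
    chrom["eps"]   = random.uniform(0.0, 1.0)
    chrom["alpha"] = random.uniform(0.0, 1.0)
    chrom["n_k"]   = random.randint(2, 20)
    chrom["h"]     = random.randint(1, 3)
    chrom["weights"] = [random.uniform(-5.0, 5.0) for _ in chrom["weights"]]
    return chrom
```

Local mutation exploits the neighborhood of a promising individual, while global mutation re-samples genes from scratch; a GA would typically apply the former with high probability and the latter with low probability.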
Fitness Function with AIC
The GA search aims to find a network structure that better fits the environment. We therefore define a fitness function that evaluates both the error function and the network structure, adopting the information criterion AIC [19] to evaluate the network structure.
The Network Evaluation with AIC [20]
AIC evaluates the goodness of fit of given models based on the mean square error
for training data and the number of parameters as follows:
\[
\mathrm{AIC} = -2\,(\text{maximum log-likelihood}) + 2F. \qquad (7.29)
\]
Here F is the number of free parameters. Let e_p = o_p − ō be the error for input pattern p, where o_p is the output pattern for the input of training case p and ō is the average of the o_p. The errors e_p are assumed to be independent and normally distributed as N(0, σ²). The likelihood of the errors for the training data is given by
\[
L = \prod_{p=1}^{P} \left(2\pi\sigma^2\right)^{-K/2} \exp\!\left(-\frac{1}{2\sigma^2}\, e_p^{T} e_p\right). \qquad (7.30)
\]
The logarithm of Eq. (7.30) gives the following:
\[
\log(L) = l = -\frac{KP}{2}\log\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{p=1}^{P} e_p^{T} e_p
            = -\frac{KP}{2}\log\left(2\pi\sigma^2\right) - \frac{1}{\sigma^2}\,E(W). \qquad (7.31)
\]
E(W) can be minimized by BP learning based on the steepest gradient descent; as a result, the maximum likelihood in Eq. (7.31) is obtained.
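Combining Eqs. (7.29) and (7.31), AIC can be computed directly from the training errors. The helper below is a minimal sketch: it substitutes the maximum-likelihood estimate σ̂² = (1/KP) Σ_p e_pᵀe_p into Eq. (7.31), a standard step the text does not spell out, and the function name `aic` is illustrative.

```python
import math

def aic(errors, K, F):
    """AIC = -2 * (maximum log-likelihood) + 2F  (Eq. 7.29).

    errors : list of P per-pattern error vectors e_p, each of length K
    K      : number of output neurons
    F      : number of free parameters in the network
    Substituting the MLE sigma^2 = (1/KP) sum_p e_p^T e_p into
    Eq. (7.31) gives l = -(KP/2) log(2*pi*sigma^2) - KP/2.
    """
    P = len(errors)
    sse = sum(sum(e * e for e in ep) for ep in errors)  # sum_p e_p^T e_p
    sigma2 = sse / (K * P)                              # MLE of sigma^2
    log_l = -(K * P / 2) * math.log(2 * math.pi * sigma2) - K * P / 2
    return -2 * log_l + 2 * F
```

Note that each extra free parameter adds exactly 2 to the AIC at fixed error, which is how the criterion trades fit quality against network size during the GA's fitness evaluation.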
Suppose the neural network has three layers: M input neurons, H hidden neurons, and K output neurons. This network has H(M + K) connection weights and