The learning procedure uses gradient descent over the parameter space and requires an error rate to be computed for the $p$-th training pattern at each node's output $O$. For the $i$-th node in the output layer $L$, with $T_{i,p}$ denoting the $i$-th component of the $p$-th desired output vector, the error rate is given by:
$$\frac{\partial E_p}{\partial O_{i,p}^{L}} = -2\left(T_{i,p} - O_{i,p}^{L}\right) \qquad (4.10)$$
The error rate for the internal node at position $(k, i)$ can be derived using the chain rule:

$$\frac{\partial E_p}{\partial O_{i,p}^{k}} = \sum_{m=1}^{\#(k+1)} \frac{\partial E_p}{\partial O_{m,p}^{k+1}} \, \frac{\partial O_{m,p}^{k+1}}{\partial O_{i,p}^{k}} \qquad (4.11)$$

where $1 \le k \le L - 1$ and $\#(k+1)$ denotes the number of nodes in layer $k + 1$. That is, the error rate of an internal node is a linear combination of the error rates of the nodes in the next layer.
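To make the recursion in Equations 4.10 and 4.11 concrete, the following minimal sketch propagates the error rates backwards through a layered network. It assumes each layer's Jacobian (the partial derivatives of layer $k+1$ outputs with respect to layer $k$ outputs) is available as a matrix; the function and argument names are illustrative, not from the text.

import numpy as np

def error_rates(targets, layer_outputs, layer_jacobians):
    # Error rates dE_p/dO for every node, computed backwards from the output layer.
    #   targets         : desired outputs T_{i,p} for the output layer L
    #   layer_outputs   : list of arrays; layer_outputs[k] holds the outputs O^k
    #   layer_jacobians : layer_jacobians[k][m, i] = dO^{k+1}_m / dO^k_i (assumed given)
    eps = [None] * len(layer_outputs)
    # Equation 4.10: error rate at the output layer
    eps[-1] = -2.0 * (targets - layer_outputs[-1])
    # Equation 4.11: chain rule, stepping from layer L - 1 down to layer 1
    for k in range(len(layer_outputs) - 2, -1, -1):
        eps[k] = layer_jacobians[k].T @ eps[k + 1]
    return eps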
Given $\alpha$ as a parameter of the adaptive network, we have

$$\frac{\partial E_p}{\partial \alpha} = \sum_{O^{*} \in S} \frac{\partial E_p}{\partial O^{*}} \, \frac{\partial O^{*}}{\partial \alpha} \qquad (4.12)$$
where $S$ is the set of nodes whose outputs depend on $\alpha$. The derivative of the overall error measure $E$ with respect to $\alpha$ then follows from Equation 4.13.
$$\frac{\partial E}{\partial \alpha} = \sum_{p=1}^{P} \frac{\partial E_p}{\partial \alpha} \qquad (4.13)$$
Furthermore, we can describe the update formula for $\alpha$ as Equation 4.14:

$$\Delta\alpha = -\eta \, \frac{\partial E}{\partial \alpha} \qquad (4.14)$$
in which $\eta$ is the learning rate.
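Equations 4.12 to 4.14 amount to an ordinary batch gradient-descent step. As a rough sketch, assuming a hypothetical helper grad_Ep(alpha, pattern) that evaluates the per-pattern derivative of Equation 4.12:

def update_parameter(alpha, training_data, grad_Ep, eta=0.01):
    # Equation 4.13: accumulate dE/dalpha over all P training patterns
    grad_E = sum(grad_Ep(alpha, pattern) for pattern in training_data)
    # Equation 4.14: gradient-descent step with learning rate eta
    return alpha - eta * grad_E

Summing over all $P$ patterns as in Equation 4.13 gives the batch (offline) form; applying the step after each individual pattern would give an online variant.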
Equations 4.6 to 4.14 describe the structure and learning process of the adaptive network. In an ANFIS architecture, this network should be functionally equivalent to a fuzzy inference system. To illustrate this mapping, consider a simple case of an ANFIS system with two inputs $x_1$ and $x_2$ and one output $y$. Suppose the rule base contains two fuzzy IF-THEN rules. Then we may write

Rule 1: IF $x_1$ is $A_1$ and $x_2$ is $B_1$, THEN $f_1 = p_1 x_1 + q_1 x_2 + r_1$
Rule 2: IF $x_1$ is $A_2$ and $x_2$ is $B_2$, THEN $f_2 = p_2 x_1 + q_2 x_2 + r_2$
where $A_i$ and $B_i$ are the antecedent fuzzy sets, $f_i$ is the output of the corresponding node in the same layer, and $p_i$, $q_i$, and $r_i$ are the parameters specific to that node. In the adaptive network, the membership function describing an antecedent can be denoted by the following node function:
$$O_i^1 = \mu_{A_i}(x) \qquad (4.15)$$
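Putting Rules 1 and 2 together with the layer-1 node function of Equation 4.15, a compact forward pass can be sketched as follows. The generalized bell membership function, the product firing strengths, and the weighted-average output follow the standard ANFIS formulation and are assumptions here; the excerpt itself only fixes the two rules and Equation 4.15, and all names below are illustrative.

import numpy as np

def bell(x, a, b, c):
    # Generalized bell membership function (an assumed shape for mu_{A_i});
    # Equation 4.15: O^1_i = mu_{A_i}(x)
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def anfis_two_rules(x1, x2, mf, consequents):
    # mf          : bell parameters (a, b, c) for the antecedents A1, A2, B1, B2
    # consequents : [(p1, q1, r1), (p2, q2, r2)] from Rules 1 and 2
    mu_A = [bell(x1, *mf["A1"]), bell(x1, *mf["A2"])]
    mu_B = [bell(x2, *mf["B1"]), bell(x2, *mf["B2"])]
    # Firing strength of rule i: product of its antecedent membership grades
    w = [mu_A[0] * mu_B[0], mu_A[1] * mu_B[1]]
    # First-order consequents: f_i = p_i*x1 + q_i*x2 + r_i
    f = [p * x1 + q * x2 + r for (p, q, r) in consequents]
    # Overall output y: firing-strength-weighted average of f1 and f2
    return (w[0] * f[0] + w[1] * f[1]) / (w[0] + w[1])

A usage example with arbitrary parameter values:

mf = {"A1": (1.0, 2.0, 0.0), "A2": (1.0, 2.0, 2.0),
      "B1": (1.0, 2.0, 0.0), "B2": (1.0, 2.0, 2.0)}
y = anfis_two_rules(0.5, 1.5, mf, [(0.1, 0.2, 0.3), (0.4, 0.5, 0.6)])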