$$\mathrm{MSE}_D = E_D\{[p(x) - f(x)]^2\}$$
is taken, where the training points are represented by the function $f(x)$ and the
fitting polynomial or the actual network output by $p(x)$. Expanding
$\mathrm{MSE}_D$ formally as
$$\mathrm{MSE}_D = E_D\{[p(x) - E_D\{p(x)\} + E_D\{p(x)\} - f(x)]^2\}$$
and rearranging its expansion as
$$\mathrm{MSE}_D = E_D\{[p(x) - E_D\{p(x)\}]^2\} + E_D\{[E_D\{p(x)\} - f(x)]^2\},$$
one gets the sum of the statistical variance
$$\mathrm{VAR}_D = E_D\{[p(x) - E_D\{p(x)\}]^2\}$$
and the statistical bias
$$\mathrm{BIAS}_D = E_D\{[E_D\{p(x)\} - f(x)]^2\}.$$
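The rearrangement into variance plus bias relies on the cross term of the expanded
square vanishing. A short derivation, under the assumption (not stated explicitly in
the text) that $f(x)$ does not depend on the data set $D$, so that
$E_D\{p(x)\} - f(x)$ is a constant with respect to $E_D$:

```latex
\begin{aligned}
\mathrm{MSE}_D
  &= E_D\{[(p(x) - E_D\{p(x)\}) + (E_D\{p(x)\} - f(x))]^2\} \\
  &= E_D\{[p(x) - E_D\{p(x)\}]^2\} + E_D\{[E_D\{p(x)\} - f(x)]^2\} \\
  &\quad + 2\,E_D\{[p(x) - E_D\{p(x)\}]\,[E_D\{p(x)\} - f(x)]\}, \\
E_D\{[p(x) - E_D\{p(x)\}]\,[E_D\{p(x)\} - f(x)]\}
  &= [E_D\{p(x)\} - f(x)]\,\bigl(E_D\{p(x)\} - E_D\{p(x)\}\bigr) = 0 .
\end{aligned}
```

With the cross term equal to zero, only the variance and bias terms remain.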
In summary, choosing the right network size is essential for good problem solving.
A network that is too small cannot fit the given data accurately (high bias) and
therefore cannot learn the most important features incorporated in the data, so its
size should be increased. A network that is too large, on the other hand, tends to
learn not only the characteristic features of the given data but also the
accompanying noise and other irrelevant idiosyncrasies hidden in the data (high
variance), so its size should be reduced. In both cases the size adjustment, whether
an increase or a reduction, should be chosen so that it gives the best network
performance. In practice, this is usually achieved by balanced network growing
and/or by network pruning.
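The trade-off described above can be illustrated numerically. The following is a
minimal sketch, not taken from the source: it uses the polynomial degree as a
stand-in for network size, assumes a hypothetical target f(x) = sin(2πx) sampled
with Gaussian noise, and estimates MSE_D, VAR_D and BIAS_D (the squared quantity
defined above) by averaging over many simulated data sets D.

```python
import numpy as np

def bias_variance(degree, n_datasets=200, n_points=30, noise_std=0.3, seed=0):
    """Estimate MSE_D, VAR_D and BIAS_D for polynomial fits of a given degree.

    Assumed setup (illustrative only): f(x) = sin(2*pi*x) on [0, 1]; each data
    set D is a noisy sample of f, and p(x) is the least-squares polynomial
    fitted to D.
    """
    rng = np.random.default_rng(seed)
    x_train = np.linspace(0.0, 1.0, n_points)
    x_test = np.linspace(0.0, 1.0, 50)
    f_test = np.sin(2 * np.pi * x_test)

    preds = np.empty((n_datasets, x_test.size))
    for d in range(n_datasets):
        y = np.sin(2 * np.pi * x_train) + rng.normal(0.0, noise_std, n_points)
        coeffs = np.polyfit(x_train, y, degree)
        preds[d] = np.polyval(coeffs, x_test)

    mean_pred = preds.mean(axis=0)             # sample estimate of E_D{p(x)}
    mse = np.mean((preds - f_test) ** 2)       # E_D{[p - f]^2}, averaged over x
    var = np.mean((preds - mean_pred) ** 2)    # E_D{[p - E_D{p}]^2}
    bias = np.mean((mean_pred - f_test) ** 2)  # [E_D{p} - f]^2
    return mse, var, bias

for degree in (1, 3, 9):
    mse, var, bias = bias_variance(degree)
    print(f"degree={degree}: MSE={mse:.4f}  VAR={var:.4f}  BIAS={bias:.4f}")
```

In this setup the low-degree fit should show the larger bias term and the
high-degree fit the larger variance term, while MSE equals their sum in each case.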
Network growing is a process of successively adding new neurons and their related
interconnections to an initially small network until the optimal network
performance is reached. This is a common way of designing optimal-sized radial
basis function networks.
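As an illustration of network growing, here is a minimal sketch for a Gaussian
radial basis function network with one input. The details are assumptions made for
the example rather than a method given in the text: each new neuron is centred on
the training point with the largest current residual, the output weights are refit
by least squares, and growth stops once the validation error has failed to improve
for `patience` consecutive additions. The names `rbf_design` and `grow_rbf` are
hypothetical.

```python
import numpy as np

def rbf_design(x, centres, width):
    """Gaussian RBF activations, one column per hidden neuron, plus a bias column."""
    phi = np.exp(-((x[:, None] - centres[None, :]) ** 2) / (2.0 * width ** 2))
    return np.hstack([phi, np.ones((x.size, 1))])

def grow_rbf(x_tr, y_tr, x_val, y_val, width=0.1, max_neurons=30, patience=3):
    """Add neurons one at a time; return the size with the lowest validation error."""
    centres = np.empty(0)
    residual = y_tr.copy()
    best = (np.inf, centres, None)
    stale = 0
    while centres.size < max_neurons and stale < patience:
        # Assumed heuristic: centre the new neuron where the current fit is worst.
        centres = np.append(centres, x_tr[np.argmax(np.abs(residual))])
        design = rbf_design(x_tr, centres, width)
        w, *_ = np.linalg.lstsq(design, y_tr, rcond=None)
        residual = y_tr - design @ w
        val_err = np.mean((y_val - rbf_design(x_val, centres, width) @ w) ** 2)
        if val_err < best[0]:
            best, stale = (val_err, centres.copy(), w), 0
        else:
            stale += 1
    return best

# Usage on synthetic data (assumed target: a noisy sine wave).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 120)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.1, 120)
val_err, centres, weights = grow_rbf(x[:80], y[:80], x[80:], y[80:])
print(f"neurons kept: {centres.size}, validation MSE: {val_err:.4f}")
```

Selecting the size on held-out data rather than on the training error is what keeps
the grown network from drifting into the overly large regime described earlier.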
Network pruning, by contrast, is a process of successively eliminating the less
relevant interconnections between the neurons of a large network until further
elimination would substantially worsen the network performance. A survey of
algorithms for network pruning was given by Reed (1993), who distinguished two
major classes of pruning methods:
• sensitivity calculation methods, based on the sensitivity of the error
  function of the trained network to the removal of individual weight
  connections, which indicates which connections should be pruned
• penalty term methods, based on modification of the error function of a
  trained network by a penalty term.
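Both classes of method can be sketched on a toy problem. The example below is an
illustration under stated assumptions, not one of the algorithms surveyed by Reed:
a single linear layer stands in for the trained network, sensitivity is measured
directly as the error obtained when one weight is set to zero, and the penalty term
is the familiar weight-decay term `lam * sum(w**2)`. The function names are
hypothetical.

```python
import numpy as np

def mse(X, y, w):
    """Mean squared error of the linear model X @ w."""
    return float(np.mean((X @ w - y) ** 2))

def sensitivity_prune(X, y, w, tol=1e-2):
    """Remove weights one by one, least sensitive first, for as long as the
    error stays within `tol` of the error of the unpruned model."""
    w = w.copy()
    base_err = mse(X, y, w)
    active = list(np.flatnonzero(w))
    while active:
        # Error obtained when each remaining weight is removed on its own.
        errs = []
        for i in active:
            trial = w.copy()
            trial[i] = 0.0
            errs.append(mse(X, y, trial))
        k = int(np.argmin(errs))
        if errs[k] - base_err > tol:
            break  # further elimination would worsen performance too much
        w[active[k]] = 0.0
        active.pop(k)
    return w

def penalty_fit(X, y, lam):
    """Minimise sum of squared errors + lam * sum(w**2) in closed form."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Usage on synthetic data in which only three of eight inputs are relevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
w_true = np.array([2.0, -1.5, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0])
y = X @ w_true + rng.normal(0.0, 0.1, 200)

w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)   # the "trained network"
w_pruned = sensitivity_prune(X, y, w_ls)
w_pen = penalty_fit(X, y, lam=10.0)
print("connections kept after pruning:", np.flatnonzero(w_pruned))
print("weight norm, plain vs. penalised:", np.linalg.norm(w_ls), np.linalg.norm(w_pen))
```

The sensitivity loop mirrors the stopping rule in the text: elimination continues
only while the performance loss stays small. The penalty version shrinks all
weights during fitting, after which small ones can be removed by a threshold.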