Fig. 6.12. Partial cost corresponding to the Delta rule
be avoided, even if they classify them correctly. This is the aim of algorithms
that look for the hyperplane of a given margin κ, that is, the weights w(κ)
such that, for all examples k,

γ_k = z_k / ||w|| ≥ κ.
The hyperplanes closer to the examples than the margin κ can be penalized
through a simple modification of the costs V(z), by replacing the aligned
field z_k everywhere by z_k − ||w||κ. In that case, if the training set is linearly
separable, the solutions of vanishing cost satisfy the above relation for all
the examples. The largest value of κ for which a solution with zero cost
exists defines the maximal stability perceptron. Note that, in practice, the
procedure of maximizing κ may be complex and time-consuming.
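As an illustration, here is a minimal NumPy sketch of such a margin-penalized cost. The function name margin_cost and the particular choice of the perceptron-type partial cost V(z) = max(0, −z) are illustrative assumptions, not taken from the text; only the substitution z_k → z_k − ||w||κ is the modification described above.

import numpy as np

def margin_cost(w, X, y, kappa):
    # Perceptron-type cost in which every aligned field z_k = y_k * (w . x_k)
    # is replaced by z_k - ||w|| * kappa, so that examples lying closer to the
    # hyperplane than the margin kappa are penalized even if correctly classified.
    z = y * (X @ w)                             # aligned fields z_k
    shifted = z - np.linalg.norm(w) * kappa     # z_k -> z_k - ||w|| * kappa
    return np.sum(np.maximum(0.0, -shifted))    # assumed partial cost V(z) = max(0, -z)

This cost vanishes exactly when every example satisfies z_k / ||w|| ≥ κ, that is, when the margin condition stated above holds for the whole training set.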
Other cost functions have adjustable parameters more or less equivalent
to κ, generically called hyperparameters. These allow one to find solutions
with better generalization properties than those obtained with the above costs.
In general, when the training examples are not linearly separable, the
discriminant surface may be represented with hidden neurons. In that case the
hyperplane defined by each neuron should correctly separate the examples,
at least in a limited neighborhood of the hyperplane. However, when the
examples are not separable, the cost functions presented above have many
local minima. Generally, the solution found by minimizing those costs does
not exhibit the property of local separation. The following partial cost (used
by the algorithm Minimerror described later) allows finding such a solution,
V(z_k) = (1/2) [1 − tanh(β z_k / ||w||)],

where β is a hyperparameter.
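For concreteness, a minimal NumPy sketch of this partial cost and of the corresponding total cost over a training set is given below; the function names are illustrative and the value of β must be chosen by the user.

import numpy as np

def minimerror_partial_cost(z, w, beta):
    # Partial cost V(z_k) = 0.5 * (1 - tanh(beta * z_k / ||w||)).
    # z    : aligned field(s) z_k = y_k * (w . x_k)
    # w    : weight vector defining the hyperplane
    # beta : hyperparameter; for large beta the cost approaches a count of
    #        misclassified examples, for small beta it is nearly linear in z.
    gamma = z / np.linalg.norm(w)              # stability of the example(s)
    return 0.5 * (1.0 - np.tanh(beta * gamma))

def minimerror_total_cost(w, X, y, beta):
    # Sum of the partial costs over the whole training set.
    z = y * (X @ w)                            # aligned fields
    return np.sum(minimerror_partial_cost(z, w, beta))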