Fig. 6.12. Partial cost corresponding to the Delta rule
be avoided, even if they classify them correctly. This is the aim of algorithms
that look for the hyperplane of a given margin κ, that is, the weights w(κ)
such that, for all examples k,

γ_k = z_k / ||w|| ≥ κ.
The hyperplanes closer to the examples than the margin κ can be penalized
through a simple modification of the costs V(z), by replacing the aligned
field z_k everywhere by z_k − ||w||κ. In that case, if the training set is linearly
separable, the solutions of vanishing cost satisfy the above relation for all
the examples. The largest value of κ for which a solution with zero cost
exists defines the maximal stability perceptron. Note that, in practice, the
procedure of maximizing κ may be complex and time-consuming.
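As an illustration, here is a minimal NumPy sketch of such a margin-penalized cost. The function name margin_cost and the particular choice of the perceptron-type partial cost V(z) = max(0, −z) are illustrative assumptions, not taken from the text; only the substitution z_k → z_k − ||w||κ is the modification described above.

import numpy as np

def margin_cost(w, X, y, kappa):
    # Perceptron-type cost in which every aligned field z_k = y_k * (w . x_k)
    # is replaced by z_k - ||w|| * kappa, so that examples lying closer to the
    # hyperplane than the margin kappa are penalized even if correctly classified.
    z = y * (X @ w)                             # aligned fields z_k
    shifted = z - np.linalg.norm(w) * kappa     # z_k -> z_k - ||w|| * kappa
    return np.sum(np.maximum(0.0, -shifted))    # assumed partial cost V(z) = max(0, -z)

This cost vanishes exactly when every example satisfies z_k / ||w|| ≥ κ, that is, when the margin condition stated above holds for the whole training set.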
Other cost functions have adjustable parameters more or less equivalent
to κ, generically called hyperparameters. These allow one to find solutions
with better generalization properties than those obtained with the above costs.
In general, when the training examples are not linearly separable, the
discriminant surface may be represented with hidden neurons. In that case the
hyperplane defined by each neuron should correctly separate the examples,
at least in a limited neighborhood of the hyperplane. However, when the
examples are not separable, the cost functions presented above have many
local minima. Generally, the solution found by minimizing those costs does
not exhibit the property of local separation. The following partial cost (used
by the algorithm Minimerror described later) allows finding such a solution,
V(z_k) = (1/2) [1 − tanh(β z_k / ||w||)],

where β is a hyperparameter.
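For concreteness, a minimal NumPy sketch of this partial cost and of the corresponding total cost over a training set is given below; the function names are illustrative and the value of β must be chosen by the user.

import numpy as np

def minimerror_partial_cost(z, w, beta):
    # Partial cost V(z_k) = 0.5 * (1 - tanh(beta * z_k / ||w||)).
    # z    : aligned field(s) z_k = y_k * (w . x_k)
    # w    : weight vector defining the hyperplane
    # beta : hyperparameter; for large beta the cost approaches a count of
    #        misclassified examples, for small beta it is nearly linear in z.
    gamma = z / np.linalg.norm(w)              # stability of the example(s)
    return 0.5 * (1.0 - np.tanh(beta * gamma))

def minimerror_total_cost(w, X, y, beta):
    # Sum of the partial costs over the whole training set.
    z = y * (X @ w)                            # aligned fields
    return np.sum(minimerror_partial_cost(z, w, beta))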