categories previously defined. SVM models were originally defined
for the classification of linearly separable classes of objects (Ivanciuc,
2007). However, SVMs can also separate classes that cannot be
separated with a linear classifier. In this case, the coordinates of the
objects (the coordinates are actually the network inputs, that is, the
independent parameters) are mapped into a feature space using nonlinear
functions called feature functions ϕ. The feature space is a high-dimensional
space in which the two classes can be separated with a linear classifier
(Ivanciuc, 2007). Since the feature space can be of very high dimension,
and it is therefore cumbersome to use its feature functions directly for
classification, special nonlinear functions called kernel functions were
introduced. Kernel functions represent inner products between the data
in a feature space, and if the kernel is given there is no need to specify
which features of the data are being used (Cristianini, 2001). Kernel
functions are also used in kernel principal component analysis (PCA).
The application of SVMs to regression analysis is described in more detail
in Chapter 4, pp. 68-69.
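To make the kernel trick concrete, the short Python sketch below (illustrative only; it assumes NumPy and scikit-learn, neither of which is mentioned in the sources cited above) first verifies that a degree-2 polynomial kernel computed in the input space equals the inner product of explicit feature maps ϕ, and then trains an SVM with a nonlinear RBF kernel on two concentric circles, a dataset that no linear classifier in the input space can separate.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Explicit feature map for the degree-2 polynomial kernel k(x, y) = (x . y)^2
# in two dimensions: phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2).
def phi(x):
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x = np.array([1.0, 2.0])
y = np.array([3.0, 0.5])
kernel_value = np.dot(x, y) ** 2           # kernel evaluated in input space
feature_value = np.dot(phi(x), phi(y))     # inner product in feature space
assert np.isclose(kernel_value, feature_value)  # identical by construction

# Two concentric circles: not separable by any linear classifier in the
# input space, but separable by a linear classifier in the (implicit)
# feature space induced by the RBF kernel.
X, labels = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)
clf = SVC(kernel="rbf", gamma=2.0).fit(X, labels)
print("training accuracy:", clf.score(X, labels))  # close to 1.0
```

The RBF kernel corresponds to an infinite-dimensional feature space, which is exactly why evaluating the kernel rather than the feature functions themselves is attractive.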
The Generalized Regression Neural Network (GRNN), often called
the Bayesian network, models the function directly from the training
data, in contrast to the parameterized modeling performed by other
types of neural networks, where weights form the parameters (Ibric
et al., 2002). It was introduced by Specht (1991), who emphasized
that the GRNN uses a method that frees it from the necessity of
assuming a specific functional form; instead, it allows the appropriate
form to be expressed as a probability density function. It was recognized
that GRNNs were derived from a statistical method of function
approximation (Patterson, 1996). GRNNs always have exactly four
layers: an input layer, a layer of so-called radial centers, a layer of
regression units, and an output layer. The radial layer performs
clustering on the known training data. The clustering algorithms
generally used to train this layer are sub-sampling, K-means, or
Kohonen learning (Ibric et al., 2002). The number of clusters, that is,
of radial units (neurons), can be optimized and usually corresponds
to the number of training samples. The regression layer has one unit
(neuron) more than the output layer, and its activation functions are
linear. The one extra unit (in comparison to the output layer) is used
to calculate the probability density, whereas the remaining units are
used to calculate the outputs. In the output layer, a specialized function
is applied in which the calculated outputs of the regression layer are
divided by the probability density (Ibric et al., 2002). The fundamental
equation of the GRNN can be presented as:
\[
\hat{Y}(X) = \frac{\displaystyle\sum_{i=1}^{n} Y_i \exp\left(-\frac{D_i^2}{2\sigma^2}\right)}{\displaystyle\sum_{i=1}^{n} \exp\left(-\frac{D_i^2}{2\sigma^2}\right)}, \qquad D_i^2 = (X - X_i)^T (X - X_i)
\]
where X is the input vector, X_i and Y_i are the input vector and the corresponding output of the i-th training sample, n is the number of training samples, and σ is the smoothing parameter (Specht, 1991).
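A minimal NumPy sketch of this equation is given below (an illustration only; the function and variable names are hypothetical, not taken from Specht (1991) or Ibric et al. (2002), and each training sample serves as its own radial center rather than a cluster center). The numerator accumulates the training outputs weighted by the Gaussian activations of the radial units, and the denominator is the probability density estimate by which the output is divided, as described above.

```python
import numpy as np

def grnn_predict(X_train, y_train, x_query, sigma=0.5):
    """Predict with the GRNN equation: a density-weighted average
    of the training outputs around the query point."""
    # Radial layer: squared Euclidean distance from the query point
    # to every training sample.
    d2 = np.sum((X_train - x_query) ** 2, axis=1)
    # Gaussian activations; sigma is the smoothing parameter.
    weights = np.exp(-d2 / (2.0 * sigma ** 2))
    # Regression layer: weighted training outputs (numerator) and the
    # probability density estimate (denominator, the one extra unit).
    return np.sum(weights * y_train) / np.sum(weights)

# Toy usage: noisy samples of a sine curve.
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 2.0 * np.pi, size=(50, 1))
y_train = np.sin(X_train[:, 0]) + rng.normal(0.0, 0.1, size=50)
print(grnn_predict(X_train, y_train, np.array([np.pi / 2])))  # approx. 1.0
```

Note that, unlike the parameterized networks mentioned above, nothing is fitted here: the prediction is computed directly from the stored training data, which is exactly the property Specht (1991) emphasized.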