simple linear summation of the RBFs. A number of different learning algorithms can be used with
an RBF-CNN. The common algorithm utilises a hybrid learning mechanism that decouples learn-
ing at the hidden layer from that at the output layer. There are two phases. First, in the unsupervised
learning phase, RBF adjustment is implemented in the hidden units using statistical clustering.
This technique involves estimating kernel positions and kernel widths using, for example, a simple
k -means-based clustering algorithm. Second, in the supervised learning phase, adjustment of the
second layer of connections is implemented using linear regression or gradient-descent techniques.
This involves determining the appropriate connection weights between units in the hidden and
the output layers using, for example, a least mean squares or backpropagation algorithm. Because
the output units are in most cases linear, the application of an initial non-iterative algorithm is
commonplace and often sufficient. However, if need be, a supervised gradient-based algorithm can
also be utilised in a further step to refine the connection parameters. A brief introduction to some
basic mathematics associated with RBF networks can be found in Bishop (2007).
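The two-phase hybrid procedure described above can be sketched in code. The following is an illustrative minimal implementation (not from the text): phase one fits the kernel positions with plain k-means and sets a common kernel width from a mean inter-centre distance heuristic; phase two solves the linear output weights non-iteratively by least squares. All function and variable names here are our own.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Phase 1 (unsupervised): estimate kernel positions by k-means clustering."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assign each sample to its nearest centre, then move centres to cluster means.
        labels = np.argmin(((X[:, None] - centres[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = X[labels == j].mean(axis=0)
    return centres

def train_rbf(X, y, k=5, seed=0):
    centres = kmeans(X, k, seed=seed)
    # Kernel width: one common heuristic is the mean distance between centres.
    d = np.sqrt(((centres[:, None] - centres[None]) ** 2).sum(-1))
    width = d[d > 0].mean()
    # Hidden-layer activations (Gaussian RBFs).
    H = np.exp(-((X[:, None] - centres[None]) ** 2).sum(-1) / (2 * width ** 2))
    # Phase 2 (supervised): non-iterative least-squares fit of the output weights,
    # valid because the output units are linear.
    w, *_ = np.linalg.lstsq(H, y, rcond=None)
    return centres, width, w

def predict(X, centres, width, w):
    """Linear summation of the RBF activations."""
    H = np.exp(-((X[:, None] - centres[None]) ** 2).sum(-1) / (2 * width ** 2))
    return H @ w
```

A gradient-based refinement step could follow, but as the text notes, the non-iterative fit is often sufficient on its own.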
It is worth noting that RBF networks have fast convergence properties and do not suffer
from the problematic effects of local minima. Indeed, compared with standard backpropagation
networks, training can be orders of magnitude faster. An important
disadvantage is the fact that RBF networks require more training data and more hidden units to
achieve the same levels of approximation.
13.7.3 ART Network
ART (adaptive resonance theory) networks differ from the two previous types of network in that
these networks are recurrent. Output from the individual PEs is not just fed forward from input
nodes to output nodes, but it is also fed backward, from output units to input units. ART provides
the basic principles and underlying concepts that are used in these networks (Grossberg 1976a,b).
ART networks were developed as possible models of cognitive phenomena in humans and animals
and thus have a stronger biological grounding than our earlier examples. ART makes use of
two important items that are used in the analysis of brain behaviour: stability and plasticity. The
stability-plasticity dilemma concerns the ability of a system to balance the retention of
previously learned patterns against the learning of new patterns.
In simple conceptual terms, an ART network contains two main layers of PEs: a top layer
(output-concept layer F2) and a bottom layer (input-feature layer F1). There are two sets of weighted
connections between each of the nodes in these two layers: top-down weights that represent learned
patterns (expectations) and bottom-up weights that represent a scheme through which the new inputs
can be accommodated. However, in more precise terms, each actual ART implementation could in
fact be disassembled into the following:
- An input processing field (F1-layer) consisting of two parts: the input portion with input
nodes and the interface portion (interconnections)
- A layer of linear units (F2-layer) representing prototype vectors whose outputs are acted
on during competitive learning, that is, the winner is the node with a weight vector that is
closest to the input vector (closest in a Euclidean distance sense)
- Various supplemental units for implementing a reset mechanism to control the degree of
matching for patterns that are to be placed in the same cluster
where the interface portion of the F1-layer combines signals from the input portion and the F2-layer,
for use in comparing input signals to the weight vector for the cluster that has been selected as a
candidate for learning. Each individual unit in the F1-layer is connected to the F2-layer by feedforward
and feedback connections. Changes in the activations of the units and in their weights are governed
by coupled differential equations.
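As a rough illustration of the winner-take-all competition and reset mechanism just described, the following hypothetical sketch clusters input vectors sequentially: the winning prototype is the one closest to the input in a Euclidean sense, a vigilance test decides whether the match is close enough, and a failed test triggers a reset that commits a new cluster. This simplified loop stands in for the full coupled-differential-equation dynamics; all names and the vigilance parameterisation are our own assumptions.

```python
import numpy as np

def art_cluster(X, vigilance=1.0, lr=0.5):
    """Assign each input vector to a cluster, creating prototypes on demand."""
    prototypes = []  # F2-layer prototype vectors, grown as new clusters form
    labels = []
    for x in X:
        if prototypes:
            d = [np.linalg.norm(x - p) for p in prototypes]
            winner = int(np.argmin(d))       # competitive learning: closest prototype wins
            if d[winner] <= vigilance:       # match passes the vigilance test: resonance
                prototypes[winner] += lr * (x - prototypes[winner])
                labels.append(winner)
                continue
        # Reset: no existing prototype matches closely enough, so commit a new cluster.
        prototypes.append(x.astype(float))
        labels.append(len(prototypes) - 1)
    return np.array(prototypes), labels
```

A smaller vigilance value forces finer clusters (more resets), while a larger value lets a single prototype absorb more varied inputs, mirroring the stability-plasticity trade-off discussed above.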
This type of CNN is in essence a clustering tool that is used for the automatic grouping of
unlabeled input vectors into several categories (clusters) such that each input is assigned a label