simple linear summation of the RBFs. A number of different learning algorithms can be used with
an RBF-CNN. The common algorithm utilises a hybrid learning mechanism that decouples learn-
ing at the hidden layer from that at the output layer. There are two phases. First, in the unsupervised
learning phase, RBF adjustment is implemented in the hidden units using statistical clustering.
This technique involves estimating kernel positions and kernel widths using, for example, a simple
k -means-based clustering algorithm. Second, in the supervised learning phase, adjustment of the
second layer of connections is implemented using linear regression or gradient-descent techniques.
This involves determining the appropriate connection weights between units in the hidden and
the output layers using, for example, a least mean squares or backpropagation algorithm. Because
the output units are in most cases linear, the application of an initial non-iterative algorithm is
commonplace and often sufficient. However, if need be, a supervised gradient-based algorithm can
also be utilised in a further step to refine the connection parameters. A brief introduction to some
basic mathematics associated with RBF networks can be found in Bishop (2007).
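The two-phase hybrid procedure described above can be sketched in code. The following is an illustrative minimal implementation (not from the text): phase one fits the kernel positions with plain k-means and sets a common kernel width from a mean inter-centre distance heuristic; phase two solves the linear output weights non-iteratively by least squares. All function and variable names here are our own.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Phase 1 (unsupervised): estimate kernel positions by k-means clustering."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assign each sample to its nearest centre, then move centres to cluster means.
        labels = np.argmin(((X[:, None] - centres[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = X[labels == j].mean(axis=0)
    return centres

def train_rbf(X, y, k=5, seed=0):
    centres = kmeans(X, k, seed=seed)
    # Kernel width: one common heuristic is the mean distance between centres.
    d = np.sqrt(((centres[:, None] - centres[None]) ** 2).sum(-1))
    width = d[d > 0].mean()
    # Hidden-layer activations (Gaussian RBFs).
    H = np.exp(-((X[:, None] - centres[None]) ** 2).sum(-1) / (2 * width ** 2))
    # Phase 2 (supervised): non-iterative least-squares fit of the output weights,
    # valid because the output units are linear.
    w, *_ = np.linalg.lstsq(H, y, rcond=None)
    return centres, width, w

def predict(X, centres, width, w):
    """Linear summation of the RBF activations."""
    H = np.exp(-((X[:, None] - centres[None]) ** 2).sum(-1) / (2 * width ** 2))
    return H @ w
```

A gradient-based refinement step could follow, but as the text notes, the non-iterative fit is often sufficient on its own.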
It is worth noting that RBF networks have fast convergence properties and do not suffer
from the problematic effects of local minima. Indeed, compared with standard backpropagation
networks, training can be orders of magnitude faster. An important
disadvantage is the fact that RBF networks require more training data and more hidden units to
achieve the same levels of approximation.
13.7.3 ART Network
ART (adaptive resonance theory) networks differ from the two previous types of network in that
these networks are recurrent. Output from the individual PEs is not just fed forward from input
nodes to output nodes, but it is also fed backward, from output units to input units. ART provides
the basic principles and underlying concepts that are used in these networks (Grossberg 1976a,b).
ART networks were developed as possible models of cognitive phenomena in humans and animals
and thus have a stronger biological grounding than our earlier examples. ART makes use of
two important items that are used in the analysis of brain behaviour: stability and plasticity. The
stability-plasticity dilemma concerns the ability of a system to balance the retention of
previously learned patterns against the learning of new patterns.
In simple conceptual terms, an ART network contains two main layers of PEs: a top layer
(output-concept layer F2) and a bottom layer (input-feature layer F1). There are two sets of weighted
connections between each of the nodes in these two layers: top-down weights that represent learned
patterns (expectations) and bottom-up weights that represent a scheme through which the new inputs
can be accommodated. However, in more precise terms, each actual ART implementation could in
fact be disassembled into the following:
- An input processing field (F1-layer) consisting of two parts: the input portion with input
nodes and the interface portion (interconnections)
- A layer of linear units (F2-layer) representing prototype vectors whose outputs are acted
on during competitive learning, that is, the winner is the node with a weight vector that is
closest to the input vector (closest in a Euclidean distance sense)
- Various supplemental units for implementing a reset mechanism to control the degree of
matching for patterns that are to be placed in the same cluster
where the interface portion of the F1-layer combines signals from the input portion and the F2-layer,
for use in comparing input signals to the weight vector for the cluster that has been selected as a
candidate for learning. Each individual unit in the F1-layer is connected to the F2-layer by feedforward
and feedback connections. Changes in the activations of the units and in their weights are governed
by coupled differential equations.
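As a rough illustration of the winner-take-all competition and reset mechanism just described, the following hypothetical sketch clusters input vectors sequentially: the winning prototype is the one closest to the input in a Euclidean sense, a vigilance test decides whether the match is close enough, and a failed test triggers a reset that commits a new cluster. This simplified loop stands in for the full coupled-differential-equation dynamics; all names and the vigilance parameterisation are our own assumptions.

```python
import numpy as np

def art_cluster(X, vigilance=1.0, lr=0.5):
    """Assign each input vector to a cluster, creating prototypes on demand."""
    prototypes = []  # F2-layer prototype vectors, grown as new clusters form
    labels = []
    for x in X:
        if prototypes:
            d = [np.linalg.norm(x - p) for p in prototypes]
            winner = int(np.argmin(d))       # competitive learning: closest prototype wins
            if d[winner] <= vigilance:       # match passes the vigilance test: resonance
                prototypes[winner] += lr * (x - prototypes[winner])
                labels.append(winner)
                continue
        # Reset: no existing prototype matches closely enough, so commit a new cluster.
        prototypes.append(x.astype(float))
        labels.append(len(prototypes) - 1)
    return np.array(prototypes), labels
```

A smaller vigilance value forces finer clusters (more resets), while a larger value lets a single prototype absorb more varied inputs, mirroring the stability-plasticity trade-off discussed above.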
This type of CNN is in essence a clustering tool that is used for the automatic grouping of
unlabeled input vectors into several categories (clusters) such that each input is assigned a label