use very different mechanisms, but still achieve a model
learning objective.
4.9.1 Algorithms That Use CPCA-Style Hebbian Learning
There are several different Hebbian learning algorithms
that use a learning rule that is either identical or very
similar to CPCA (equation 4.12). We have already men-
tioned the one that provided the basis for our analysis of
CPCA, the competitive learning algorithm of Rumel-
hart and Zipser (1986). We also encountered the other algorithms in chapter 3, in the discussion of inhibitory competition functions. These include the "soft" competitive learning algorithm of Nowlan (1990) and the Kohonen network of Kohonen (1984). The primary difference between these other algorithms and CPCA + kWTA lies in the activation dynamics, not in the way that the weights are adjusted.
These activation dynamics are a very important part of
the overall learning algorithm, because the activations
determine the limits of what kinds of representations
can be learned, and shape the process of learning.
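To make this concrete, the following is a minimal sketch of a CPCA-style weight update in Python. The function name, array shapes, and learning rate are illustrative assumptions, and the update follows the standard CPCA form dw_ij = lrate * y_j * (x_i - w_ij) (our reading of equation 4.12), not the simulator's actual code.

    import numpy as np

    def cpca_update(w, x, y, lrate=0.01):
        # CPCA-style Hebbian update: each receiving unit j moves its weight
        # vector toward the current input pattern, in proportion to its own
        # activation y_j:  dw_ij = lrate * y_j * (x_i - w_ij)
        # w: (n_inputs, n_hidden) weights; x: (n_inputs,); y: (n_hidden,)
        return w + lrate * y[None, :] * (x[:, None] - w)

Because y gates the update, a unit that is inactive on a given pattern does not learn from it; the activation dynamics therefore determine which units learn about which patterns.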
As we discussed in chapter 3, the kWTA inhibitory
function results in a sparse distributed representation,
whereas the activation functions used in these other
learning algorithms do not. Both competitive learning
and its softer version assume only one unit is active at
a time (i.e., a localist representation), and although the
Kohonen network does have multiple units active, the
activations of all the units are tied directly to a single
winner, and it lacks the kind of combinatorial flexibility
that we saw was important in our exploration of self-
organizing learning.
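The contrast between these activation schemes can also be sketched in code. Each of the functions below maps a vector of net inputs h onto hidden activations y, and any of them could drive the same cpca_update sketched above. The particular functional forms (a softmax, a top-k selection, a one-dimensional Gaussian neighborhood) are illustrative simplifications, not the algorithms' exact equations.

    import numpy as np

    def hard_wta(h):
        # Competitive learning: a localist code; only the single most
        # excited unit is active.
        y = np.zeros_like(h)
        y[np.argmax(h)] = 1.0
        return y

    def soft_wta(h):
        # "Soft" competitive learning: graded, normalized activations
        # (a softmax stands in for Nowlan's mixture posteriors).
        e = np.exp(h - h.max())
        return e / e.sum()

    def kwta(h, k=3):
        # kWTA-style sparse distributed code: the k most excited units are
        # active together (a simplification of the actual kWTA function).
        y = np.zeros_like(h)
        y[np.argsort(h)[-k:]] = 1.0
        return y

    def kohonen_neighborhood(h, sigma=1.0):
        # Kohonen-style code: all activity is tied to the single winner
        # through a (here one-dimensional Gaussian) neighborhood function.
        winner = np.argmax(h)
        idx = np.arange(len(h))
        return np.exp(-((idx - winner) ** 2) / (2.0 * sigma ** 2))

Only kwta yields a combinatorial, sparse distributed pattern; the others either activate a single unit or tie all of the activity to a single winner.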
Figure 4.14: The competitive learning algorithm causes the
weight vectors to move toward the centers of input data clus-
ters (adapted from Rumelhart & Zipser, 1986). Linsker (1988)
showed that these centers are maximally informative.
4.9.2 Clustering

The competitive learning algorithm provides an interesting interpretation of the objective of self-organizing learning in terms of clustering (Rumelhart & Zipser, 1986; Duda & Hart, 1973; Nowlan, 1990). Figure 4.14 shows how competitive learning moves the weights toward the centers of the clusters, or natural groupings, in the input data. It is easy to see that strongly correlated input patterns will tend to form such clusters, so in some sense this is really just another way of looking at the PCA idea. Importantly, it goes beyond simple PCA by incorporating the additional assumption that there are multiple separate clusters, and that different hidden units should specialize on representing different clusters. This is the essence of the conditionalizing idea in CPCA, where a given unit only learns on those patterns that are somehow relevant to its cluster. If you recall the digit network from the previous chapter, it should be clear that this kind of clustering would tend to produce representations of the tightest clusters in the inputs, which corresponded to the noisy versions of the same digits.³
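This clustering behavior is easy to demonstrate in a small simulation. The sketch below is illustrative and not taken from the text: it draws noisy points around three made-up cluster centers and trains three hidden units with hard winner-take-all competition and the single-winner form of the weight update. The winner is chosen by Euclidean distance rather than a dot product, which is a simplification, but the weight vectors still migrate toward the cluster centers, as in figure 4.14.

    import numpy as np

    rng = np.random.default_rng(0)

    # Three natural groupings in the input data: noisy points around
    # three hypothetical cluster centers.
    centers = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    data = np.concatenate([c + 0.05 * rng.standard_normal((100, 2)) for c in centers])
    rng.shuffle(data)

    w = rng.random((3, 2))   # one weight vector per (localist) hidden unit
    lrate = 0.05

    for epoch in range(20):
        for x in data:
            j = np.argmin(np.linalg.norm(w - x, axis=1))  # winner: closest weight vector
            w[j] += lrate * (x - w[j])                     # move the winner toward the input

    print(np.round(w, 2))  # each row ends up near one of the three cluster centers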
However, when one moves beyond the single active unit case by using something like the kWTA function, clustering becomes a somewhat less apt metaphor. Nevertheless, it is possible to think in terms of multiple clusters being active simultaneously (represented by the k active units).

4.9.3 Topography

The Kohonen network provides another interpretation of self-organizing learning in terms of the formation of topographic maps. The idea here is to not only represent the basic correlational structure of the environment, but also to represent the neighborhood relation-

³ Note also that this kind of clustering has been used in a similar way as the cluster plots from the previous chapter for condensing and analyzing high-dimensional data.