7.3.8 Probabilistic Topological Map
Similarly to the $k$-means algorithm, a probabilistic version of SOM, called PRSOM, can be defined [Anouar et al. 1997; Gaul et al. 2000]. The difference between SOM and PRSOM is essentially that, for PRSOM, a Gaussian density $f_c$ is associated to each neuron $c$ of the map. Each Gaussian density function $f_c$ is completely defined by its mean vector (the equivalent of the reference vector of SOM) $w_c = (w_c^1, w_c^2, \dots, w_c^n)$, and by its covariance matrix, a square symmetric positive-definite matrix $\Sigma_c$, restricted to isotropic densities: $\Sigma_c = \sigma_c^2 I$, where $I$ is the $(n,n)$ unit matrix. Then the density functions can be written as

$$f_c(z) = \frac{1}{(2\pi)^{n/2}\,\sigma_c^n} \exp\left(-\frac{\|z - w_c\|^2}{2\sigma_c^2}\right).$$
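As a concrete check of this definition, the isotropic Gaussian density of a single neuron can be evaluated directly. The sketch below uses our own names (`gaussian_density`, `w_c`, `sigma_c`), which are not from the text:

```python
import numpy as np

def gaussian_density(z, w_c, sigma_c):
    """Isotropic Gaussian density f_c of a PRSOM neuron.

    z       : observation vector of dimension n
    w_c     : mean (reference) vector of neuron c
    sigma_c : positive scalar standard deviation of neuron c
    """
    n = z.shape[0]
    # normalization constant of an n-dimensional isotropic Gaussian
    norm = (2.0 * np.pi) ** (n / 2.0) * sigma_c ** n
    return np.exp(-np.sum((z - w_c) ** 2) / (2.0 * sigma_c ** 2)) / norm
```

Because the covariance matrix is restricted to $\sigma_c^2 I$, a single scalar per neuron suffices instead of a full $(n,n)$ matrix.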
Thus, in the PRSOM, each neuron $c$ of the map is allocated the mean vector $w_c$ and the positive scalar $\sigma_c$. As for SOM, the data space $D$ is partitioned into the subsets of the family $\{P_c \,/\, c \in C\}$. The subset $P_c$ is described by the density function $f_c$: $w_c$ represents its associated reference vector, and $\sigma_c$ estimates the standard deviation of the observations of $P_c \cap A$ around $w_c$. The two parameter sets $W = \{w_c;\, c \in C\}$ and $\sigma = \{\sigma_c;\, c \in C\}$ completely define the PRSOM. Their values must be estimated during training from the training set $A$.

If we assume that the underlying distribution of the data is a Gaussian mixture, the PRSOM allows the parameters of that mixture to be estimated. A neural interpretation of PRSOM can be given: the architecture associated with the PRSOM has three layers (Fig. 7.16):
• Data is presented to the input layer.
• The map $C$ is duplicated into two similar maps $C_1$ and $C_2$ that have the same topology as the map $C$ in the SOM model. The generic neuron of map $C_1$ (resp. $C_2$) will be denoted $c_1$ (resp. $c_2$).
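Given trained parameters, the partition of the data space mentioned above can be sketched by assigning each observation to the neuron whose density $f_c$ is largest. This maximum-density rule is our illustrative choice and is not necessarily the exact allocation rule of the chapter:

```python
import numpy as np

def assign_to_neuron(z, W, sigma):
    """Return the index of the neuron c maximizing f_c(z).

    W     : (K, n) array, one mean vector w_c per neuron
    sigma : (K,) array of positive scalars sigma_c
    Maximizing f_c(z) is equivalent to maximizing log f_c(z),
    which avoids numerical underflow for distant observations.
    """
    n = W.shape[1]
    log_f = -n * np.log(sigma) - np.sum((W - z) ** 2, axis=1) / (2.0 * sigma ** 2)
    return int(np.argmax(log_f))
```

When all the $\sigma_c$ are equal, this rule reduces to nearest-neighbour assignment on the reference vectors, i.e. the usual SOM allocation.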
That approach was first described by Luttrell [Luttrell 1994]; it assumes that a random propagation occurs forward and backward through the three layers of the network. In the backward direction, from the map to the data space, that propagation is described by the conditional probabilities $p(c_1 \mid c_2)$ and $p(z \mid c_1, c_2)$. Moreover, the Markov assumption is postulated, namely that $p(z \mid c_1, c_2) = p(z \mid c_1)$. Then the probability of each observation $z$ can be computed explicitly as

$$p(z) = \sum_{c_2} p(c_2)\, p_{c_2}(z), \quad \text{with} \quad p_{c_2}(z) = \sum_{c_1} p(c_1 \mid c_2)\, p(z \mid c_1).$$
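Taking $p(z \mid c_1) = f_{c_1}(z)$ with the isotropic Gaussian densities defined earlier, the double sum above can be evaluated exactly. The sketch below assumes hypothetical array names for the probability tables:

```python
import numpy as np

def observation_probability(z, W, sigma, p_c2, p_c1_given_c2):
    """p(z) = sum_{c2} p(c2) p_{c2}(z),  p_{c2}(z) = sum_{c1} p(c1|c2) p(z|c1).

    W             : (K, n) mean vectors of the K neurons
    sigma         : (K,) positive scalars sigma_c
    p_c2          : (K,) prior probabilities p(c2)
    p_c1_given_c2 : (K, K) table, row c2 and column c1, each row summing to 1
    """
    n = W.shape[1]
    # p(z|c1) = f_{c1}(z): isotropic Gaussian density of each neuron c1
    sq_dist = np.sum((W - z) ** 2, axis=1)
    f = np.exp(-sq_dist / (2.0 * sigma ** 2)) / ((2.0 * np.pi) ** (n / 2.0) * sigma ** n)
    p_c2_of_z = p_c1_given_c2 @ f    # p_{c2}(z) for every neuron c2
    return float(p_c2 @ p_c2_of_z)
```

Since each $f_{c_1}$ is a probability density and the probability tables are normalized, $p(z)$ is itself a valid mixture density over the data space.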