7.3.8 Probabilistic Topological Map
Similarly to the $k$-means algorithm, a probabilistic version of SOM, called PRSOM, can be defined [Anouar et al. 1997; Gaul et al. 2000]. The difference between SOM and PRSOM is essentially that, for PRSOM, a Gaussian density $f_c$ is associated to each neuron $c$ of the map. Each Gaussian density function $f_c$ is completely defined by its mean vector (the equivalent of the reference vector of SOM) $w_c = (w_c^1, w_c^2, \dots, w_c^n)$, and by its covariance matrix, a square symmetric positive-definite matrix $\Sigma_c$, restricted to isotropic densities: $\Sigma_c = \sigma_c^2 I$, where $I$ is the $(n,n)$ unit matrix. Then the density functions can be written as

$$f_c(z) = \frac{1}{(2\pi)^{n/2}\,\sigma_c^n} \exp\left(-\frac{\|z - w_c\|^2}{2\sigma_c^2}\right).$$
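As a concrete check of this definition, the isotropic Gaussian density of a single neuron can be evaluated directly. The sketch below uses our own names (`gaussian_density`, `w_c`, `sigma_c`), which are not from the text:

```python
import numpy as np

def gaussian_density(z, w_c, sigma_c):
    """Isotropic Gaussian density f_c of a PRSOM neuron.

    z       : observation vector of dimension n
    w_c     : mean (reference) vector of neuron c
    sigma_c : positive scalar standard deviation of neuron c
    """
    n = z.shape[0]
    # normalization constant of an n-dimensional isotropic Gaussian
    norm = (2.0 * np.pi) ** (n / 2.0) * sigma_c ** n
    return np.exp(-np.sum((z - w_c) ** 2) / (2.0 * sigma_c ** 2)) / norm
```

Because the covariance matrix is restricted to $\sigma_c^2 I$, a single scalar per neuron suffices instead of a full $(n,n)$ matrix.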
Thus, in the PRSOM, each neuron $c$ of the map is allocated the mean vector $w_c$ and the positive scalar $\sigma_c$. As for SOM, the data space $D$ is partitioned into the subsets of the family $\{P_c \,/\, c \in C\}$. The subset $P_c$ is described by the density function $f_c$: $w_c$ represents its associated reference vector, and $\sigma_c$ estimates the standard deviation of the observations of $P_c \cap A$ around $w_c$. The two parameter sets $W = \{w_c;\, c \in C\}$ and $\sigma = \{\sigma_c;\, c \in C\}$ completely define the PRSOM. Their values must be estimated during training from the training set $A$.

If we assume that the underlying distribution of the data is a Gaussian mixture, the PRSOM allows the parameters of that mixture to be estimated. A neural interpretation of PRSOM can be given: the architecture associated with the PRSOM has three layers (Fig. 7.16):
• Data is presented to the input layer.
• The map $C$ is duplicated into two similar maps $C_1$ and $C_2$ that have the same topology as the map $C$ in the SOM model. The generic neuron of map $C_1$ (resp. $C_2$) will be denoted $c_1$ (resp. $c_2$).
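Given trained parameters, the partition of the data space mentioned above can be sketched by assigning each observation to the neuron whose density $f_c$ is largest. This maximum-density rule is our illustrative choice and is not necessarily the exact allocation rule of the chapter:

```python
import numpy as np

def assign_to_neuron(z, W, sigma):
    """Return the index of the neuron c maximizing f_c(z).

    W     : (K, n) array, one mean vector w_c per neuron
    sigma : (K,) array of positive scalars sigma_c
    Maximizing f_c(z) is equivalent to maximizing log f_c(z),
    which avoids numerical underflow for distant observations.
    """
    n = W.shape[1]
    log_f = -n * np.log(sigma) - np.sum((W - z) ** 2, axis=1) / (2.0 * sigma ** 2)
    return int(np.argmax(log_f))
```

When all the $\sigma_c$ are equal, this rule reduces to nearest-neighbour assignment on the reference vectors, i.e. the usual SOM allocation.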
That approach was first described by Luttrell [Luttrell 1994]; it assumes that a random propagation occurs forward and backward through the three layers of the network. In the backward direction, from the map to the data space, that propagation is described by the conditional probabilities $p(c_1 \mid c_2)$ and $p(z \mid c_1, c_2)$. Moreover, the Markov assumption is postulated, namely that $p(z \mid c_1, c_2) = p(z \mid c_1)$. Then the probability of each observation $z$ can be computed explicitly as

$$p(z) = \sum_{c_2} p(c_2)\, p_{c_2}(z), \quad \text{with} \quad p_{c_2}(z) = \sum_{c_1} p(c_1 \mid c_2)\, p(z \mid c_1).$$
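Taking $p(z \mid c_1) = f_{c_1}(z)$ with the isotropic Gaussian densities defined earlier, the double sum above can be evaluated exactly. The sketch below assumes hypothetical array names for the probability tables:

```python
import numpy as np

def observation_probability(z, W, sigma, p_c2, p_c1_given_c2):
    """p(z) = sum_{c2} p(c2) p_{c2}(z),  p_{c2}(z) = sum_{c1} p(c1|c2) p(z|c1).

    W             : (K, n) mean vectors of the K neurons
    sigma         : (K,) positive scalars sigma_c
    p_c2          : (K,) prior probabilities p(c2)
    p_c1_given_c2 : (K, K) table, row c2 and column c1, each row summing to 1
    """
    n = W.shape[1]
    # p(z|c1) = f_{c1}(z): isotropic Gaussian density of each neuron c1
    sq_dist = np.sum((W - z) ** 2, axis=1)
    f = np.exp(-sq_dist / (2.0 * sigma ** 2)) / ((2.0 * np.pi) ** (n / 2.0) * sigma ** n)
    p_c2_of_z = p_c1_given_c2 @ f    # p_{c2}(z) for every neuron c2
    return float(p_c2 @ p_c2_of_z)
```

Since each $f_{c_1}$ is a probability density and the probability tables are normalized, $p(z)$ is itself a valid mixture density over the data space.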