Self-Organizing Maps and Unsupervised Classification - Neural Networks: Methodology and Applications

Information Technology Reference

In-Depth Information

Thus, an iteration can be summarized as follows:

•

Allocation phase: I ( W,χ ) is minimized with respect to the χ ; during that

phase, the reference vectors retain their previous values (or the initial

values for the fist iteration). The minimization is performed when each

observation z i

is allocated to the reference w c

by the allocation function

χ :

2 .

χ ( z )=argmin

r

z

−

w r

(1)

In that relation, r varies from 1 to p (the number of reference vectors). By

allocating the closest reference vector (in the sense of Euclidean distance)

w c to each observation z i , the cost function I ( W,χ ) is minimized. The

new allocation function χ defines a new partition P of the set D (the

closest reference vector has to be understood according to the Euclidean

distance). In the following, n c is the cardinal of the set A

∩

P c .

•

Minimization phase: I ( W,χ ) is minimized with respect to the reference

set W ; the allocation function χ that was computed at the previous step

is kept constant. The cost function I ( W,χ ) is then a convex quadratic

function with respect to W . Its global minimum is reached for

∂W = ∂I

T

∂I

∂ w 1 , ∂I

∂ w 2 ,..., ∂I

=0 .

∂ w p

The computation of the gradient that is associated to each reference vector

w c provides a new set of vector equations

2

z i ∈A

χ ( z i ) = c

( z i −

w c )=0 ,

which define the new reference vectors

1

n c

w c =

z i .

(2)

z i ∈P c A

That algorithm can be proved to converge. If the allocation function that

was computed in the first phase is applied, the class of an observation z

changes only if its contribution to the global inertia that is computed with re-

spect to the reference set W decreases. Therefore, that global inertia is smaller

than the current value of I ( W,χ ). The second phase consists in updating the

reference set W . Each reference vector w c defines the center of inertia of the

observation set P c ∩

A . That requires that I ( W,χ ) decrease, since it is the

inertia with respect to the center of inertia of partition P . When the two

phases are alternatively iterated, the cost function I ( W,χ ) decreases. I ( W,χ )

is expressed as a function of the trace of partition P on the data set A .That

trace is a partition of A . Since the number of partitions of set A is finite, the

iterative process converges to a local minimum of the cost function I ( W,χ )

with respect to the reference set and to the allocation function.

Neural Networks: Methodology and Applications

Search WWH ::

Custom Search

Home