output space, distances greater than ρ are no longer taken into account. Decreasing ρ during training allows certain nonlinear manifolds to be unfolded, and possibly torn. The projection of a sphere of R3 into R2 (Fig. 3.4) shows an example of a manifold whose projection requires such a tearing. The function F is therefore used to unfold certain manifolds while preserving the local topology as far as possible.
Therefore, the objective function minimized by CCA takes the following form:

$$E = \sum_{i=1}^{p} \sum_{j=i+1}^{p} \left( X_{ij} - Y_{ij} \right)^2 F(Y_{ij}).$$
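As an illustration, this cost can be evaluated numerically. The sketch below assumes a step neighbourhood function, F(d) = 1 if d ≤ ρ and 0 otherwise; this is one common choice, and any positive decreasing function of Y_ij would do. The function names are ours, introduced for illustration only.

```python
import numpy as np

def pairwise_distances(Z):
    """Euclidean distance matrix between the rows of Z."""
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

def cca_cost(x, y, rho):
    """CCA cost E = sum over pairs i<j of (X_ij - Y_ij)^2 F(Y_ij).

    x : (p, n) data in the original space
    y : (p, m) coordinates in the reduced space, m < n
    F is taken here as a step function of width rho (an assumption)."""
    X = pairwise_distances(x)            # X_ij: distances in the original space
    Y = pairwise_distances(y)            # Y_ij: distances in the reduced space
    F = (Y <= rho).astype(float)         # neighbourhood weighting F(Y_ij)
    i, j = np.triu_indices(len(x), k=1)  # each pair counted once (j > i)
    return float(np.sum((X[i, j] - Y[i, j]) ** 2 * F[i, j]))
```

With two points at original distance 5 projected to a distance of 1, the cost is (5 − 1)² = 16 when ρ covers the pair, and 0 when ρ excludes it.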
3.5.2 Curvilinear Component Analysis Algorithm
The algorithm consists in minimizing the above cost function with respect to the coordinates of each point of the database in the reduced space. Training can be performed by any of the minimization algorithms described in Chap. 2. For illustration, we describe the minimization of the cost function by stochastic gradient.
Thus, we compute the partial derivatives of the cost function with respect to each parameter; denoting by y_ik the k-th coordinate of point i,

$$\frac{\partial E}{\partial y_{ik}} = \sum_{j \neq i} \frac{\partial E}{\partial Y_{ij}} \frac{\partial Y_{ij}}{\partial y_{ik}} = -\sum_{j \neq i} \left[ 2 F(Y_{ij}) - (X_{ij} - Y_{ij}) F'(Y_{ij}) \right] \frac{X_{ij} - Y_{ij}}{Y_{ij}} \left( y_{ik} - y_{jk} \right).$$
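This derivative can be checked against a finite difference of the cost. The self-contained sketch below uses a smooth F(Y) = exp(−Y/ρ) so that F' exists everywhere; the choice of ρ and the function names are ours, for illustration only.

```python
import numpy as np

rho = 2.0
F  = lambda Y: np.exp(-Y / rho)          # smooth neighbourhood function
dF = lambda Y: -np.exp(-Y / rho) / rho   # its derivative F'

def cost(x, y):
    """E = sum over pairs i<j of (X_ij - Y_ij)^2 F(Y_ij)."""
    E = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            X = np.linalg.norm(x[i] - x[j])
            Y = np.linalg.norm(y[i] - y[j])
            E += (X - Y) ** 2 * F(Y)
    return E

def grad_i(x, y, i):
    """Analytic gradient dE/dy_i from the formula above."""
    g = np.zeros_like(y[i])
    for j in range(len(x)):
        if j == i:
            continue
        X = np.linalg.norm(x[i] - x[j])
        Y = np.linalg.norm(y[i] - y[j])
        g += -(2 * F(Y) - (X - Y) * dF(Y)) * (X - Y) / Y * (y[i] - y[j])
    return g
```

Perturbing one coordinate of y_i by ±ε and differencing the cost reproduces the corresponding component of grad_i to within the finite-difference error.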
Parameters are updated as follows, where µ is the gradient step:

$$\Delta y_i = \mu \sum_{j \neq i} \left[ 2 F(Y_{ij}) - (X_{ij} - Y_{ij}) F'(Y_{ij}) \right] \frac{X_{ij} - Y_{ij}}{Y_{ij}} \left( y_i - y_j \right).$$
A condition must be satisfied to guarantee the convergence of the minimization: the term β_ij = 2F(Y_ij) − (X_ij − Y_ij)F'(Y_ij) must be positive. If Y_ij is too large with respect to X_ij, point j should be brought closer to point i. The function F(Y_ij) should therefore be selected so as to guarantee β_ij > 0. That condition is difficult to satisfy: for instance, for F(Y_ij) = exp(−Y_ij/ρ), stability requires ρ > (Y_ij − X_ij)/2, a condition that cannot always be fulfilled because ρ decreases during training. The following simplification of the training rule, which amounts to taking for F the step function that equals 1 for Y_ij ≤ ρ and 0 beyond (so that F' vanishes almost everywhere), guarantees almost everywhere that β_ij = 2 > 0:

$$\Delta y_i = \mu \sum_{j \neq i} \begin{cases} \dfrac{X_{ij} - Y_{ij}}{Y_{ij}} \left( y_i - y_j \right) & \text{if } Y_{ij} \le \rho, \\ 0 & \text{otherwise}. \end{cases}$$
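One stochastic step of this simplified rule can be sketched as follows; a minimal illustration assuming the step neighbourhood function, with point i drawn at random by the caller (the function name is ours, not from the text).

```python
import numpy as np

def cca_update(x, y, i, mu, rho):
    """One stochastic step of the simplified CCA rule: move point i in the
    reduced space under the influence of its output-space neighbours.

    x : (p, n) data in the original space
    y : (p, m) current coordinates in the reduced space
    mu : gradient step; rho : neighbourhood radius."""
    X = np.linalg.norm(x - x[i], axis=1)    # X_ij, original-space distances
    Y = np.linalg.norm(y - y[i], axis=1)    # Y_ij, reduced-space distances
    neighbours = (Y <= rho) & (Y > 0.0)     # j != i with Y_ij <= rho
    delta = np.zeros_like(y[i])
    for j in np.flatnonzero(neighbours):
        # (X_ij - Y_ij)/Y_ij (y_i - y_j): repel if too close, attract if too far
        delta += (X[j] - Y[j]) / Y[j] * (y[i] - y[j])
    y[i] = y[i] + mu * delta
    return y
```

In a full run one would sweep i over the whole database repeatedly, slowly decreasing both µ and ρ during training, as described above.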