Information Technology Reference
In-Depth Information
(
N
(
KD
V
)
3
), which prohibits its application
in LCS, where we require algorithms that preferably scale linearly with the num-
ber of classifiers. Nonetheless, it is of significant theoretical value, as it provides
the values for
V
that maximise (6.3) and can therefore act as a benchmark for
other mixing models and their associated methods.
IRLS algorithm is of complexity
O
6.1.2
Incremental Learning by Least Squares
Following a similar but slightly modified derivation to the one give by Jordan
and Jacobs [121], we can incrementally approximate the maximum of (6.3) by
a recursive least squares procedure that is of lower complexity than the IRLS
algorithm. Due to the convexity of
E
(
V
), its unique minimum is found when its
gradient is
∇
V
E
(
V
)=
0
,thatis,when
V
satisfies
N
(
g
k
(
x
n
)
−
r
nk
)
φ
(
x
n
)=
0
,
k
=1
,...,K.
(6.11)
n
=1
Substituting (6.2) for
g
k
, we want to solve
m
k
(
x
n
)
φ
(
x
n
)=
0
N
exp(
v
k
φ
(
x
n
))
r
nk
m
k
(
x
n
)
j
=1
m
j
(
x
n
)exp(
v
j
φ
(
x
n
))
−
(6.12)
n
=1
Thus, the difference between the left-hand term and the right-hand term inside
the brackets is to be minimised, weighted by
m
k
(
x
n
), such that
exp(
v
k
φ
(
x
n
))
j
=1
m
j
(
x
n
)exp(
v
j
φ
(
x
n
))
≈
r
nk
m
k
(
x
n
)
,
m
k
(
x
n
)
m
k
(
x
n
)
(6.13)
holds for all
n
.Solvingtheabovefor
v
k
φ
(
x
n
), its desired target values is
r
nk
m
k
(
x
n
)
−
ln
ln
C
n
,
(6.14)
where
C
n
=
j
m
j
(
x
n
)exp(
v
j
φ
(
x
n
)) is the normalising term that is common
to all
v
k
φ
(
x
n
) and can therefore be omitted, as it disappears when
v
k
φ
(
x
n
)is
converted to
g
k
(
x
n
). Therefore, the target for
v
k
φ
(
x
k
)isln
r
nk
m
k
(
x
n
)
,weightedby
m
k
(
x
n
). This allows us to reformulate the problem of finding values for
V
that
maximise (6.3) as the
K
linear least squares problems of minimising
m
k
(
x
n
)
v
k
φ
(
x
n
)
2
N
r
nk
m
k
(
x
n
)
−
ln
,
k
=1
,...,K.
(6.15)
n
=1
r
nk
Even though
r
nk
=0if
m
k
(
x
n
) = 0, and therefore
m
k
(
x
n
)
is undefined in
such a case, this does not cause any problems, as in such a case the weight
is equally zero which makes computing the target superfluous. Also note that
Search WWH ::
Custom Search