this deviation by the difference measure $h(\hat{f}(x) - f(x))$, where $h$ is some convex function $h : \mathbb{R} \to \mathbb{R}_+$, mixing by a weighted average allows for the derivation of an upper bound on this difference measure:
Theorem 6.1. Given the global estimator $\hat{f} : \mathcal{X} \to \mathbb{R}$, that is formed by a weighted averaging of $K$ local estimators $\hat{f}_k : \mathcal{X} \to \mathbb{R}$ by
$$\hat{f}(x) = \sum_k g_k(x)\, \hat{f}_k(x),$$
such that $g_k(x) \geq 0$ for all $x$ and $k$, and $\sum_k g_k(x) = 1$ for all $x$, the difference between the target function $f : \mathcal{X} \to \mathbb{R}$ and the global estimator is bounded from above by
$$h\bigl(\hat{f}(x) - f(x)\bigr) \leq \sum_k g_k(x)\, h\bigl(\hat{f}_k(x) - f(x)\bigr), \qquad \forall x \in \mathcal{X}, \tag{6.21}$$
where $h : \mathbb{R} \to \mathbb{R}_+$ is a convex function. More specifically, we have
$$\bigl(\hat{f}(x) - f(x)\bigr)^2 \leq \sum_k g_k(x) \bigl(\hat{f}_k(x) - f(x)\bigr)^2, \qquad \forall x \in \mathcal{X}, \tag{6.22}$$
and
$$\bigl|\hat{f}(x) - f(x)\bigr| \leq \sum_k g_k(x) \bigl|\hat{f}_k(x) - f(x)\bigr|, \qquad \forall x \in \mathcal{X}. \tag{6.23}$$
Proof. For any $x \in \mathcal{X}$, we have
$$h\bigl(\hat{f}(x) - f(x)\bigr) = h\Bigl(\sum_k g_k(x)\, \hat{f}_k(x) - f(x)\Bigr) = h\Bigl(\sum_k g_k(x) \bigl(\hat{f}_k(x) - f(x)\bigr)\Bigr) \leq \sum_k g_k(x)\, h\bigl(\hat{f}_k(x) - f(x)\bigr),$$
where we have used $\sum_k g_k(x) = 1$, and the inequality is Jensen's Inequality (for example, [231]), based on the convexity of $h$ and the weighted average property of the $g_k$. Having proven (6.21), (6.22) and (6.23) follow from the convexity of $h(a) = a^2$ and $h(a) = |a|$, respectively.
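The bound can also be checked numerically. The following sketch is purely illustrative (the random setup and all variable names are assumptions, not taken from the text): for arbitrary local predictions and normalised weights, the error of the mixed prediction never exceeds the weighted average of the local errors, for both choices of $h$ in (6.22) and (6.23).

```python
import numpy as np

# Numerical check of (6.22) and (6.23): for arbitrary local predictions
# f_k(x) and weights g_k(x) >= 0 with sum_k g_k(x) = 1, the error of the
# mixed prediction never exceeds the weighted average of the local errors.
rng = np.random.default_rng(0)
K, N = 5, 1000                           # number of classifiers, sample inputs

f_target = rng.normal(size=N)            # target f(x) at N sample inputs
f_local = rng.normal(size=(K, N))        # local estimates f_k(x)
g = rng.random(size=(K, N))
g /= g.sum(axis=0, keepdims=True)        # normalise so that sum_k g_k(x) = 1

f_mixed = (g * f_local).sum(axis=0)      # global estimate sum_k g_k(x) f_k(x)

# Left- and right-hand sides of (6.22), h(a) = a^2
lhs_sq = (f_mixed - f_target) ** 2
rhs_sq = (g * (f_local - f_target) ** 2).sum(axis=0)

# Left- and right-hand sides of (6.23), h(a) = |a|
lhs_abs = np.abs(f_mixed - f_target)
rhs_abs = (g * np.abs(f_local - f_target)).sum(axis=0)

assert np.all(lhs_sq <= rhs_sq + 1e-12)    # Jensen's inequality holds
assert np.all(lhs_abs <= rhs_abs + 1e-12)
print("bounds (6.22) and (6.23) hold at all", N, "sample points")
```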
Therefore, the error of the global estimator can be minimised by assigning high weights, that is, high values of $g_k(x)$, to classifiers whose local estimators have a small error. Observing in (6.18) that the value of $g_k(x)$ is directly proportional to the value of $\gamma_k(x)$, a good heuristic will assign high values to $\gamma_k(x)$ whenever the error of the local estimator can be expected to be small. The design of all heuristics is based on this intuition.
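As a purely illustrative sketch of this intuition (the inverse-error choice of $\gamma_k$ below is an assumption for demonstration purposes, not one of the heuristics developed in the text, and (6.18) itself is not reproduced here), one can set $\gamma_k$ inversely proportional to an estimate of the local error and normalise to obtain $g_k(x)$; the resulting mixed prediction then typically has a lower error than uniform mixing.

```python
import numpy as np

# Illustrative only: weight classifiers by the inverse of an (assumed known)
# local error estimate, normalise to obtain g_k(x), and compare against
# uniform mixing. Variable names and the setup are hypothetical.
rng = np.random.default_rng(1)
K, N = 5, 1000

f_target = rng.normal(size=N)
noise_sd = np.array([0.1, 0.5, 1.0, 2.0, 4.0])           # local estimator quality
f_local = f_target + noise_sd[:, None] * rng.normal(size=(K, N))

est_err = noise_sd ** 2                                   # estimated local error
gamma = 1.0 / est_err                                     # high gamma_k for low expected error
g = np.repeat(gamma[:, None], N, axis=1)
g /= g.sum(axis=0, keepdims=True)                         # sum_k g_k(x) = 1

mse_weighted = np.mean(((g * f_local).sum(axis=0) - f_target) ** 2)
mse_uniform = np.mean((f_local.mean(axis=0) - f_target) ** 2)
print(f"inverse-error weighting MSE: {mse_weighted:.4f}")
print(f"uniform mixing MSE:          {mse_uniform:.4f}")
```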
The probabilistic formulation of the LCS model results in a further bound,
this time on the variance of the output prediction:
Theorem 6.2. Given the density $p(y \mid x, \theta)$ for output $y$ given input $x$ and parameters $\theta$, formed by the $K$ classifier model densities $p(y \mid x, \theta_k)$ by $p(y \mid x, \theta) = \sum_k g_k(x)\, p(y \mid x, \theta_k)$, such that $g_k(x) \geq 0$ for all $x$ and $k$, and $\sum_k g_k(x) = 1$ for all $x$,