3.3.2 Theoretical and Empirical MEE Behaviors
We start by analyzing simple settings with Gaussian inputs and then move
on to more realistic settings. The simple Gaussian-input settings provide the
basic insights on the distinct aspects of theoretical and empirical MEE, in
a theoretically controlled way. More realistic datasets serve to confirm those
insights.
3.3.2.1 Univariate and Bivariate Gaussian Datasets
Let us consider the perceptron with Gaussian inputs and the tanh activation
function. Applying Theorem 3.2, the class-conditional error densities are:
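\[
f_{E|t}(e) = \frac{\exp\left( -\dfrac{1}{2}\,
  \dfrac{\left( \operatorname{atanh}(t-e) - (\mathbf{w}^T \boldsymbol{\mu}_t + w_0) \right)^2}
        {\mathbf{w}^T \boldsymbol{\Sigma}_t \mathbf{w}} \right)}
  {\sqrt{2\pi\, \mathbf{w}^T \boldsymbol{\Sigma}_t \mathbf{w}}\;\, e\,(2t-e)}\;
  \mathbb{1}_{]t-1,\,t+1[}(e) .
\tag{3.43}
\]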
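The factor $e(2t-e)$ in the denominator of (3.43) comes from a change of variables. The following is only a sketch of that step, not a reproduction of Theorem 3.2 (which is not restated in this section). It assumes targets coded as $t \in \{-1, 1\}$ and the error defined as $E = T - Y$, both of which are inferred from the support $]t-1, t+1[$ and the argument $\operatorname{atanh}(t-e)$. The symbols $U$, $m_t$ and $s_t^2$ are shorthand introduced here.

```latex
\begin{align*}
U \mid t &= \mathbf{w}^T \mathbf{X} + w_0 \sim N(m_t, s_t^2),
  \qquad m_t = \mathbf{w}^T \boldsymbol{\mu}_t + w_0, \quad
  s_t^2 = \mathbf{w}^T \boldsymbol{\Sigma}_t \mathbf{w}, \\
Y = \tanh(U) \;&\Rightarrow\;
  f_{Y|t}(y) = \frac{f_{U|t}(\operatorname{atanh} y)}{1 - y^2},
  \qquad y \in\, ]-1, 1[ , \\
E = t - Y \;&\Rightarrow\;
  f_{E|t}(e) = f_{Y|t}(t - e),
  \qquad e \in\, ]t-1, t+1[ , \\
1 - (t - e)^2 &= 1 - t^2 + 2te - e^2 = e\,(2t - e)
  \qquad \text{(since } t^2 = 1\text{)} .
\end{align*}
```

Substituting the Gaussian density of $U \mid t$ into the third line and using the last identity yields the form of (3.43).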
We first consider the univariate case $\varphi(w_1 x + w_0)$, with $w_1$
controlling the steepness of the activation function; the error density is then
\[
f_{E|t}(e) = \frac{\exp\left( -\dfrac{1}{2}
  \left( \dfrac{\operatorname{atanh}(t-e) - (w_1 \mu_t + w_0)}{w_1 \sigma_t} \right)^2 \right)}
  {\sqrt{2\pi}\, w_1 \sigma_t\; e\,(2t-e)}\;
  \mathbb{1}_{]t-1,\,t+1[}(e) .
\tag{3.44}
\]
Even for this simple case there is no closed-form expression of $H_S$ (or
$H_{R_2}$). One has to resort to numerical integration and apply expressions
(C.3) (or (C.5)). Setting w.l.o.g. $(\mu_{-1}, \sigma_{-1}) = (0, 1)$ we obtain
the $H_S$ behavior shown in Fig. 3.14 [212].
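As a starting point for such numerical work, the density (3.44) can be evaluated and sanity-checked directly. The sketch below is not the computation behind Fig. 3.14 and does not evaluate expressions (C.3) or (C.5). It only checks that (3.44) integrates to one and agrees with the probability obtained from the Gaussian law of $w_1 x + w_0$. The function names, the choice $w_1 = 0.5$, $w_0 = -0.75$, and the class-1 parameters $(3, 1)$ are illustrative assumptions, as is $w_1 > 0$.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm


def error_density(e, t, w1, w0, mu_t, sigma_t):
    """Class-conditional error density of Eq. (3.44).

    Assumes w1 > 0 and targets t in {-1, +1} (assumptions of this sketch).
    """
    e = np.atleast_1d(np.asarray(e, dtype=float))
    inside = np.abs(t - e) < 1.0          # indicator of the support ]t-1, t+1[
    dens = np.zeros_like(e)
    ei = e[inside]
    s = w1 * sigma_t                      # std. dev. of w1*x + w0 within class t
    z = (np.arctanh(t - ei) - (w1 * mu_t + w0)) / s
    dens[inside] = np.exp(-0.5 * z**2) / (np.sqrt(2 * np.pi) * s * ei * (2 * t - ei))
    return dens


def error_cdf(c, t, w1, w0, mu_t, sigma_t):
    """P(E <= c | t), computed from the Gaussian law of w1*x + w0.

    E <= c  <=>  tanh(U) >= t - c  <=>  U >= atanh(t - c).
    """
    u = np.arctanh(t - c)
    return norm.sf((u - (w1 * mu_t + w0)) / (w1 * sigma_t))


# Illustrative setting (assumed): split point -w0/w1 = 1.5.
w1, w0 = 0.5, -0.75
classes = {-1: (0.0, 1.0), 1: (3.0, 1.0)}   # t -> (mu_t, sigma_t)

for t, (mu_t, sigma_t) in classes.items():
    f = lambda e: error_density(e, t, w1, w0, mu_t, sigma_t)[0]
    total, _ = quad(f, t - 1, t + 1)        # should be close to 1
    c = t - 0.3                             # arbitrary interior point of the support
    partial, _ = quad(f, t - 1, c)          # integral of the density up to c
    exact = error_cdf(c, t, w1, w0, mu_t, sigma_t)
    print(f"t={t:+d}  integral={total:.6f}  P(E<=c): quad={partial:.6f}  exact={exact:.6f}")
```

For steep activations (large $w_1$) the density concentrates near the ends of its support, so a plain quadrature call like the one above may need tighter tolerances or a change of variable.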
Figure 3.14a corresponds to $(\mu_1, \sigma_1) = (3, 1)$. The optimal split point
(the “decision border” in this case) is at $x = 1.5$. We observe that for small
values of $w_1$ (top figure) $H_S$ exhibits a maximum at the optimal split point,
instead of a minimum. A minimum is obtained for a sufficiently large $w_1$
(bottom figure). The same behavior is observed in Fig. 3.14b, corresponding
to $(\mu_1, \sigma_1) = (1, 1)$ with $x = 0.5$. This behavior is, in fact, general
for both $H_S$ and $H_{R_2}$, no matter the degree of distribution overlap: the
theoretical MEE perceptron is able to produce the min $P_e$ solution.
We now move to the bivariate case, fixing $\boldsymbol{\mu}_{-1} = [0\ 0]^T$,
$\boldsymbol{\Sigma}_t = \mathbf{I}$, and study two different settings:
$\boldsymbol{\mu}_1 = [5\ 0]^T$ (distant classes) and $\boldsymbol{\mu}_1 = [1\ 0]^T$
(close classes). For these two settings the min $P_e$ value is 0.0062 and 0.3085,
respectively. These min $P_e$ values correspond to infinitely many optimal
solutions $\mathbf{w} = [w_1\ 0\ w_0]^T$: any $(w_1, w_0)$ pair s.t.
$-w_0/w_1 = 2.5$ and $-w_0/w_1 = 0.5$, respectively.
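The two min $P_e$ values and the ratio condition on $(w_1, w_0)$ can be checked with a few lines of arithmetic. The sketch below assumes equal class priors (not stated explicitly here, but consistent with the quoted values) and $w_1 > 0$. The function name `perceptron_error` is hypothetical. Because the second weight is zero, only the first input coordinate matters, and the decision border is the line $x_1 = -w_0/w_1$.

```python
from scipy.stats import norm


def perceptron_error(w1, w0, mu1, p1=0.5):
    """Error probability of the border w1*x1 + w0 = 0 (with w1 > 0, w2 = 0).

    Classes: t = -1 ~ N([0, 0], I), t = +1 ~ N([mu1, 0], I).
    p1 is the prior of class +1 (equal priors assumed by default).
    """
    split = -w0 / w1                      # decision border on the x1 axis
    miss_neg = norm.sf(split)             # class -1 mass to the right of the border
    miss_pos = norm.cdf(split - mu1)      # class +1 mass to the left of the border
    return (1 - p1) * miss_neg + p1 * miss_pos


for mu1, ratio in [(5.0, 2.5), (1.0, 0.5)]:
    for w1 in (0.5, 1.0, 10.0):
        w0 = -ratio * w1                  # any pair with -w0/w1 = ratio
        print(f"mu1={mu1}  w1={w1:5.1f}  w0={w0:6.2f}  Pe={perceptron_error(w1, w0, mu1):.4f}")
# Every line for mu1=5.0 prints Pe=0.0062; every line for mu1=1.0 prints Pe=0.3085.
```

The error depends on $(w_1, w_0)$ only through the ratio $-w_0/w_1$, which is why each setting has infinitely many optimal solutions.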
 